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ABSTRACT 


EFFICIENCY  AND  SPEED  IN  LEGGED  ROBOTICS 

by 

Paul  L.  Muench 


Advisor:  Ka  C  Cheok,  Ph.D. 

It  may  seem  preposterous,  but  walking  is  actually  faster  than  rolling  under 
certain  conditions.  Even  though  bio-inspired  robots  can  have  advantageous  mobility, 
they  can  also  be  very  inefficient.  This  idea,  the  balance  between  efficiency  and  speed, 
is  the  core  of  this  dissertation.  Inefficiencies  in  legged  robotics  are  due  to  many  things, 
e.g.,  collision  losses,  joint  friction,  etc.,  but  here  we  address  the  inefficiencies  due  to 
suboptimal  control  schemes. 

We  seek  a  control  scheme  that  will  decide:  1)  whether  it  makes  sense  to  walk  or 
not — we  seek  an  energy  regime  for  walking  that  depends  on  the  energy  already  in  the 
system,  and  then  2)  how  wide  we  should  make  our  stride,  i.e.,  the  optimal  stride  angle, 
and  finally  3)  an  optimal  switching  curve:  when  should  we  power,  and  when  should  we 
coast  to  achieve  the  greatest  possible  speed  at  the  least  energy  cost. 

This  dissertation  presents  control  schemes  for  a  lossless  “rimless  wheel”  model 
as  well  as  a  two-link  serial  robot  model  for  walking.  Using  Pontryagin’s  Maximum 
Principle  (PMP),  we  describe  the  cost  function,  the  state/co-state  equations,  and  the 
switching  conditions  for  these  models.  The  research  results  of  both  models  show  an 


v 


“on-off’  control  and  the  “switching  curve”  between  these  control  extremes.  It  is  not 
possible  to  find  a  complete  closed-form  solution  for  some  parts  of  this  problem,  and 


numerical  methods,  such  as  dynamic  programming  and  Nelder  Mead  optimization, 
must  be  used  for  simulation  and  visualization  of  the  results. 
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CHAPTER  1 


INTRODUCTION 


1.1  Transportation  Problem 

It  may  seem  preposterous,  but  walking  is  actually  faster  than  rolling.  Normally 
we  think  of  rolling  as  the  fastest  and  most  practical  means  of  transportation  while 
legged  systems  have  held  a  somewhat  elusive  promise  of  mobility  (Raibert,  1986). 
Until  recently,  most  research  in  legged  robotics  has  focused  on  stability.  Recent 
research  (Ruina,  1998)  has  come  to  the  conclusion  that  stability  is  a  necessary,  but  not 
sufficient  condition;  efficiency  is  the  key  to  unlocking  the  true  benefits  of  walking 
mechanisms.  To  introduce  the  problem  of  transportation,  we  adopt  this  useful  model 
shown  in  Figure  1.1.  Given  an  initial  energy  Eq  ,  move  a  mass  m  from  point  A  to  point 
B  in  time  t  expending  additional  energy  E. 


B 


Figure  1.1:  The  Transportation  Problem. 
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The  primary  goal  of  transportation  is  to  “move  the  mass  m  from  point  A  to  B.” 
The  most  efficient  thing  would  be  to  leave  the  mass  at  A  and  be  done  with  it! 
Admittedly,  that  is  not  the  most  useful  solution.  So  if  you’ve  resigned  yourself  to 
actually  move  the  mass,  now  what?  You  are  left  to  balance  the  cost  of  how  much 
energy  E  you  will  expend  vs.  how  much  time  t  it  will  take  to  do  the  job.  This  tradeoff 
will  be  the  subject  of  our  study  as  we  explore  the  transportation  problem — the  optimal 
control — of  two  distinct  models  of  walking:  1)  the  rimless  wheel — essentially  a  simple 
pendulum;  and  2)  the  compass  gait — essentially  a  double  pendulum. 

1.1  Brachistochrone  Problem 

Let's  begin  the  discussion  with  an  example  where  energy  is  conserved,  if  the 

energy  is  constant  (or  conserved),  the  goal  becomes  minimizing  the  travel  time  I — 

finding  the  least  time  path,  or  brachistochrone.  Consider  the  following  problem:  We 

wish  to  construct  a  ramp  such  that  our  mass  m  starts  at  rest  rolls  (or  slides  without 

friction)  from  point  A  to  point  B  in  the  least  amount  of  time.  The  temptation  is  to  draw 

a  straight  line  from  point  A  to  point  B,  since  obviously  the  shortest  geometric  distance 

between  two  points  is  a  straight  line.  The  only  problem  with  this  temptation,  of  course, 

is  that  it’s  wrong.1  Looking  again  at  Figure  1.1,  imagine  that  our  initial  energy  E0 

came  in  two  forms:  kinetic  and  potential.  If  we  first  start  from  rest  at  point  A,  all  of  our 

initial  energy  Eq  is  stored  as  potential  energy;  the  kinetic  bucket  is  empty.  Recall  that 

for  conservative  systems,  we  can  convert  energy  from  one  form  to  another  without 

penalty;  so  in  order  to  get  things  moving,  we’d  like  to  convert  some  of  that  stored 

1  The  solution  to  the  brachistochrone  problem  is  a  cycloid.  This  is  a  famous  problem  in  mathematics,  for 
the  analysis  of  it  led  to  the  formulation  of  the  calculus  of  variations  (Goldstein,  1 959). 
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•  2 

energy  into  kinetic  energy.  Not  only  that,  but  we’d  like  to  do  it  as  soon  as  possible. 

The  fastest  way  to  dump  potential  energy  is  to  free-fall  from  position  A  as  shown  in 
Figure  1 .2.  As  we  pick  up  speed,  we  track  our  course  back  towards  B.  Interesting  that 
the  fastest  way  towards  our  goal  is  to  head  away  from  it.  We  will  see  this  idea  again  as 
we  examine  the  rimless  wheel. 

1.3  The  Rimless  Wheel 

Consider  the  following  model  for  walking,  shown  in  Figure  1.3,  originally 
proposed  by  Tad  McGeer  (McGeer,  1990).  It  consists  of  a  mass  m  with  rigid  spokes 
extending  outwards,  like  a  wagon  wheel  without  the  outer  rim.  The  rimless  wheel 
pivots  and  collides  with  the  ground  on  rigid  spokes  instead  of  rolling,  behaving  as  an 
inverted  pendulum  between  intermittent  plastic  collisions  with  the  ground.  From 


Figure  1.2:  The  Brachistochrone  Problem. 


2  Pick  up  speed  sooner  rather  than  later,  so  that  the  time  savings  start  accruing  right  away.  Your  financial 
advisor  probably  has  similar  advice. 
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position  A,  we  pick  up  speed  as  in  the  brachistochrone  problem  by  trading  in  potential 
energy  for  more  kinetic  energy.  If  the  rim  were  present,  you  would  be  stuck  with  the 
kinetic  energy  you  began  with,  however  slow  that  might  be.  You  would  never  lose  any 
speed,  but  you  could  not  gain  any  either. 

What  we  gain  in  increased  average  speed  with  the  rimless  wheel  model  we  will 
pay  for  by  deviating  from  the  straight  line  path.  As  we  shall  soon  discover,  this  tradeoff 
will  have  a  point  of  diminishing  returns. 


Figure  1.3:  The  Rimless  Wheel 
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CHAPTER  2 


RESEARCH  BACKGROUND 

2.1  Passive  Dynamics 

Much  of  the  recent  literature  on  walking  mechanisms  has  focused  on  passive 
dynamic  walking,  or  the  ability  of  these  mechanisms  to  walk  based  only  on  physics,  not 
on  actuation  and  control.  This  concept  finds  its  origins  in  walking  toys  from  the  19th 
century  (Fallis,  1888).  These  toys  wobble  back  and  forth  as  they  periodically  swing 
their  legs  and  “walk’’  down  a  shallow  slope.  They  were  the  inspiration  behind  Tad 
McGeer’s  seminal  work  (McGeer,  1990),  the  analysis  of  a  class  of  walking  machines, 
wherein  gravity  replaces  the  energy  lost  from  friction  and  collision,  thus  creating  a 
stable  walking  gait  without  any  external  control. 

Andy  Ruina  and  his  group  at  Cornell  (Garcia.  Chatterjee.  Ruina,  &  Coleman, 
1998)  have  simplified  these  passive  dynamic  models  by  introducing  the  compass  gait. 
Since  that  time,  Art  Kuo  and  others  in  the  community  have  found  biological  data 
suggesting  that  these  methods  can  extend  to  human  walking  gaits  as  well  (Kuo, 
Donelan,  &  Ruina,  2005). 


2.2  Active  Dynamics 

While  this  notion  of  walking  as  the  passive  unfolding  of  system  dynamics  is 
elegant  and  intriguing,  a  real  world  application  of  these  ideas  requires  actuation  and 


5 


control.  Beginning  with  his  hopping  robots  at  the  MIT  Leg  Lab  (Raibert,  1 986),  Marc 
Raibert  has  advanced  the  state  of  the  art  in  legged  robot  control  for  practical 
application.  His  most  recent  creation,  the  BigDog  (Playter,  Buehler,  &  Raibert,  2006), 
is  the  most  advanced  legged  robot  in  the  world  for  outdoor  applications.  BigDog,  based 
on  high-gain,  high-bandwidth  control,  does  not  incorporate  the  aforementioned 
principles  of  passive  dynamics.  When  actuators  are  directly  coupled  to  the  legs,  this  is 
difficult  to  achieve.  Applying  a  clutching  mechanism  solves  this  problem  by  having 
both  active  and  passive  modalities.  Russ  Tedrake  used  optimal  control  and  learning 
algorithms  to  his  semi-passive  walker  (Tedrake,  2004).  The  robot  quickly  learned  how 
to  deal  with  the  uncertainty  of  its  under-actuated  control. 

Adding  a  virtual  clutch  is  another  way  to  achieve  both  active  and  passive 
actuation.  The  series  elastic  actuator  of  Gill  Pratt  (Pratt  &  Williamson,  1 997)  acts  as  a 
force  source  during  actuation  and  applies  zero  force  during  the  passive  swing  phase. 
This  series  elastic  actuator  has  been  the  central  component  of  several  robots  built  by 
Jerry  Pratt  and  his  co-workers  (Pratt  &  Krupp,  2004).  Jerry  Pratt  combined  this 
actuation  method  with  virtual  model  control  in  his  doctoral  thesis  (Pratt  J. ,  2000).  In 
the  research  which  follows  we  assume  some  type  of  clutching  mechanism — of  either 
the  virtual  or  actual  variety — in  order  to  take  advantage  of  both  active  and  passive 
control. 

2.3  Overview  of  the  Research 

Throughout  this  dissertation,  we  examine  the  rimless  wheel  without  collision 
losses.  Why  are  we  ignoring  collisions  when  they  are  necessary  for  limit  cycle 
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behavior?  First,  limit  cycle  behavior  is  not  a  necessary  condition  of  walking — you  do 
not  need  periodic  motion  to  complete  the  task  of  getting  from  point  A  to  point  B.  In 
reality,  the  necessary  condition  for  walking  is  simply  stability  (i.e.,  not  falling  down). 
Second,  stability  is  only  a  necessary,  but  not  a  sufficient  condition  for  walking.  As  we 
saw  in  the  brachistochrone  problem,  what  you  are  really  trying  to  achieve  is  efficiency 
and  speed!  If  walking  isn’t  the  fastest  way  to  get  there,  we  should  pick  another  mode  of 
transportation,  despite  this  author’s  preference.  Efficiency  and  speed  are  the  reasons  for 
walking  in  the  first  place  and  thus  they  are  the  focus  of  this  dissertation.  If  we  cannot 
justify  walking  transport  under  these  ideal  conditions  (lossless  collision),  then  it’s 
pointless  to  go  further.  If,  however,  we  find  justification  for  walking,  we  bear  in  mind 
that  this  is  a  theoretical  limit  which  can  be  approached,  but  never  achieved1. 

In  Chapter  3,  we  will  model  walking  along  a  constant  energy  line— the  passive 
rimless  wheel.  First,  we  show  the  stride  width  with  the  fastest  speed — the  optimal 
stride  angle.  Next,  we  find  the  band  of  energies  where  walking  makes  sense — the 
energy  regime  of  walking. 

In  Chapter  4,  we  examine  the  problem  of  optimally  powering  a  pendulum — the 
active  rimless  wheel.  We  show  an  optimal  solution  to  powering  the  walking  motion 
with  respect  to  time  and  energy  cost.  Using  Dynamic  Programming  and  Pontryagin’s 
Maximum  Principle  (PMP),  we  show  an  “on-off’  control,  and  we  describe  the 
“switching  curve”  between  these  control  extremes. 

1  similar  to  the  thermal  efficiency  of  the  Carnot  Cycle  of  an  engine 
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Finally,  In  Chapter  5,  we  extend  these  methods  to  the  compass  gait,  where  the 
dynamics  of  the  state  and  co-state  equations  cannot  be  solved  analytically,  and  we  rely 
on  a  direct  search  method,  the  Nelder  Mead  Simplex. 


S 


CHAPTER  3 


PASSIVE  RIMLESS  WHEEL 

3.1  The  Simple  Pendulum 

An  energy-based  derivation  of  the  simple  pendulum  equations  (Strogatz,  1994) 
provides  a  succinct  model  of  the  walking  behavior  between  collisions  of  the  rimless 
wheel.  If  we  ignore  friction,  this  is  a  conservative  system  as  shown  in  Figure  3.1.  We 

1  9  1  9  '  9 

derive  the  pendulum  equation  by  adding  the  kinetic  energy  T  =  —  mv~  =  —  ml  0  with 
the  potential  Energy,  V  =  -mgl  cos  6  to  yield  the  total  energy, 

1  9  ■  9 

E  =  T  +  V  =  —  ml  0~  -  mgl  cos  0  .  We  nondimensionalize  this  result  by  setting 

m=l  =  g  =  1 


m 

Figure  3.1:  The  Simple  Pendulum 
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(3-1) 


E  =  —92  -cos# 
2 


Along  constant  energy  lines  we  can  differentiate  both  sides  to  find 

dE 


dt 


=  0  =  #(#  +  sin  #) 


Now,  either  nothing  is  moving,  9  =  0 ,  or 


#  +  sin#  =  0 


(3-2) 


which  is  the  pendulum  equation.  Since  this  is  a  conservative  system,  we  can  plot  the 
1  *2 

energy  contours  E  =  —  9  -  cos  9  for  different  values  of  E  as  shown  Figure  3.2.  Note 

that  in  equation  (3.1),  if  we  set  the  value  of  9  =  0,  and  9  =  n ,  the  pendulum  is 
balanced  inverted.  You  can  see  that  this  energy  value,  E  =  1,  is  the  minimum  energy 
for  walking,  because  values  of  E  less  than  1  will  not  swing  past  9  =  n ,  that  is,  they 
cannot  whirl  over  the  top.  The  separatrix  appears  as  a  red  line  in  this  figure  at  energy 
level  E  =  1 .  This  contour  line  is  called  the  separatrix,  because  it  separates  two  distinct 
behaviors  for  the  pendulum:  whirling  over  the  top  or  oscillating  about  the  bottom. 
Energy  levels  above  the  separatrix,  E  >  1 ,  have  enough  energy  to  whirl  over  the  top, 
while  energy  levels  below  the  separatrix,  E  <  1 ,  oscillate  about  the  origin  at  9  =  0, 
never  quite  reaching  the  top.  Figure  3.2  shows  a  zoomed  in  view  of  the  phase  plane  of 
an  inverted  pendulum  about  its  unstable  node  at  n .  The  different  energy  levels  are 
labeled  on  the  contour  lines. 
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Figure  3.2:  The  Phase  Plane  of  the  Pendulum 


We’re  really  not  interested  in  the  entire  phase  plane  of  the  pendulum:  for  one 
thing,  the  rimless  wheel  cannot  dip  below  the  floor,  so  we  are  physically  limited  to 

,L<e<— 

2  2 

Practically  speaking,  we  would  like  to  look  at  an  even  smaller  stride  length  than  that. 
We  can  see  from  Figure  3.3  that  the  stride  angle  limits  the  phase  plane  even  further  to 
the  following  equation  (3.3): 
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Figure  3.3:  One  stride  length 


n-a<6<n+a  (3.3) 

Also  note  from  the  figure  above  that  the  stride  length  D : 

D  =  2sina  (3.4) 

3.2  Optimal  Stride  Angle 

Let's  assume,  for  the  moment,  that  you  could  create  a  rimless  wheel  that  had  no 
collision  loss.  We  could  argue  about  what  kind  of  device  could  possibly  have  zero 
collision  loss,  but  if  we  could  create  one,  the  bottom  line  is  this:  what  step  size,  a, 
would  be  the  fastest?  That  is,  what  step  size  would  give  us  the  fastest  average  forward 
speed,  v  ?  This  provides  a  rock-bottom  analysis  for  walking,  which  should  serve  as  a 
useful  guide  for  building  practical  systems.  For  the  simple  pendulum,  we  find  the 
swing  time  by  solving  for  9  in  equation  (3.1) 

9  =  E  +  cosO  (3.5) 
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Separation  of  Variables: 


n+a 


I  i 


n+a 


dO 


n-a 


e  V2  J  yj  E  +  COS  0 


(3.6) 


n-a 


The  above  equation  for  stride  time  T  can  be  stated  in  terms  of  elliptic  integrals,  as 
shown  in  the  next  section.  If  we  use  equation  (3.4)  for  the  stride  length  D  and  use 


equation  (3.6)  for  stride  time  T,  we  find  the  formula  for  average  speed:  v  =~  : 


v(E,a)  = 


2sino 


n+a 


(3.7) 


.Ft  J 


d6 


yfl  ■’  ylE+c.os6 
n-a 

After  integration  is  performed,  v  is  solely  a  function  of  the  stride  angle,  a,  as 
long  as  the  energy  E  remains  constant  during  the  stride.  To  find  the  fastest  stride-angle 
a  for  a  given  energy  contour,  E ,  we  set  the  derivative  of  (3.7)  to  zero  with  respect  to 
a,  using  Leibniz  rule,  while  holding  E  constant: 

2  sin  a 


n+a 


=  COS 


cos  a 


(3.8) 


dO 


n-a 


yj E  + COS  0 


Those  values  of  a  that  satisfy  equation  (3.8)  are  the  optimal  stride  angles  a  for  a  given 
energy  E.  Note  that  the  left  hand  side  of  the  above  equation  bears  remarkable  similarity 
to  the  average  speed  v  in  equation  (3.7).  We  will  explain  the  significance  of  that  fact 
over  the  course  of  the  chapter.  For  now,  we  plot  the  optimal  stride  angles  a*  over  the 
range  of  energy  contours  1  <  E  <  1 .5  as  shown  in  Figure  3.4. 
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Contour  Plot  with  Overlaid  Local  Maximum  Curve 


(X6J8U3)  3 
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Figure  3.4:  Plot  of  Fastest  Stride  Angles, 


3.3  Elliptic  Integrals 

We  see  from  Figure  3.4  that  the  optimal  step  size  goes  to  zero  as  we  approach 


E  =  1.5.  This  is  the  maximum  walking  energy,  a  fact  which  we  will  explain  later  in  the 
chapter.  In  this  section,  we  show  that  a  closed  form  solution  can  be  found  for  the  swing 
time.  The  equation  for  swing  time  (3.6)  can  be  described  by  elliptic  integrals  (Armitage 
&  Eberlein,  2006).  We  define  the  elliptic  integral  of  the  first  kind  as: 

dz 


yj\-k  sin  z 

The  complete  elliptic  integral  of  the  first  kind  is  also  defined  as: 

dz 


(3.9) 


a/5 i-k  sin  z 

This  form  can  be  seen  in  equation  (3.6)  by  substituting  the  trig  identity 

Q 

cos  0  =  1  -  2  sin  —  into  the  denominator  of  the  integrand: 


(3.10) 


y/E  +  cos  0  -  ^E  + 1  -2  sin  ^ 


■Ve+ Tji 


2  .  0 

- sin  — 

E  + 1  2 


(3.11) 


0  2 

Substituting  <j)  -  —  dd  =  2d<j)  and  k  =  — — -  into  equation  (3.1 1),  we  find  a  new 
form  of  the  swing  time  integral  (3.6): 


7t+a 


T  =  k  j 


d(j> 


J  I  9  ~  2 
n-a  yl  —  k  sin  i 


(3.12) 
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Now  we  can  write  the  swing  time  integral  in  terms  of  the  standard  elliptic  integrals 


defined  above.  To  do  this,  we  take  advantage  of  the  periodic  nature  of  .  . = 

\j\-k2  sin 2  tf> 

as  shown  in  Figure  3.5. 

All  the  information  in  the  integrand  is  contained  in  the  interval  (0  — »  y) .  Any 

other  interval  required  can  be  obtained  by  symmetry.  Since  swing  time  equation  (3.12) 
integrates  the  section  2 A  in  the  figure,  we  will  recast  section  A  in  terms  of  the  standard 
elliptic  integrals.  A  can  be  rewritten  as  A+F  -  F  where  F  is  the  elliptic  integral  of  the 
first  kind,  and  K=A+F  is  the  complete  elliptic  integral  of  the  first  kind.  Thus  T  =  2A  = 
2(K-F).  Now  the  swing  time  is  in  terms  of  the  standard  elliptic  integrals: 

T  =  2k{K[k2]- F\j>,k21^ 


Substituting  our  values  for  k  and  cj) : 


T-  2 41 

2 

rr 

n-a  2  ]] 

v^ir 

_£  +  !_ 

2  ’  £  +  ljJ 

where  F  is  the  elliptic  integral  of  the  first  kind  and  K  is  the  complete  elliptic  integral  of 
the  first  kind  as  defined  above. 

3.4  Cleveland  Problem 

In  the  analysis  which  follows  in  Sections  3. 5-3. 6,  we  examine  average  walking 
speed  using  the  position  average,  vg .  The  position  average  is  an  approximation  of  the 

time  average.  This  can  be  seen  most  clearly  with  what  we'll  call  the  “Cleveland 
Problem”:  Say  you  are  on  a  road  trip  from  Detroit  to  Cleveland.  Let’s  call  the  mid¬ 
point  of  the  trip  the  Ohio  border  in  Toledo  where  the  speed  limit  changes  from,  say, 
70mph  to  50mph.  What  is  your  average  speed  if  you  travel  the  speed  limit  the  entire 
trip?  If  we  take  the  position  average,  the  answer  is  60mph.  However,  you  are  actually 
spending  more  time  in  Ohio,  since  your  speed  is  slower  there,  so  the  time  average  is 
somewhat  less  than  60mph. 


3.5  Position  Average 

To  find  the  position  average  of  speed,  Vq  ,  we  first  need  to  find  speed  v  as  a 

function  of  6  .  Looking  at  Figure  3.6,  we  see  the  familiar  transformation  from 
cylindrical  to  Cartesian  coordinates.  The  instantaneous  speed,  v,  is  along  the  y- 
direction  in  this  figure  and  is  given  by  v(#)  =  r6 cos 6 ,  if  /-  =  l,then 

v(#)  =  <9  cos# 
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(3.14) 


X 


Figure  3. 6:  Pendulum  Walker 


We  know  from  past  work,  equation  (3.5),  that  6  -  yf2^J E  +  cosd  .  Substituting  this  into 


(3.14),  we  obtain  this  formula  for  the  instantaneous  speed  as  a  function  of  angle#: 


v(#)  =  yfly/E  +  cos  6  cos  0  (3.15) 


The  formula  for  instantaneous  speed  v(#)  is  not  an  approximation.  However,  if  we 


take  the  position  average  of  (3.15)  with  respect  to  6 : 


.  7T+CI 

vq  = —  [  -J2yj E  +  cos#  cos OdO  (3.16) 

2  a  i 
n-a 


This  is  an  approximation  of  the  time  average.  To  find  optimum  stride-angle  for  a  given 


energy  contour,  we  need  to  set  the  derivative  of  (3.16)  to  zero  with  respect  to  a.  holding 


E  constant  (using  Leibniz  integral  rule): 


.  7T+U 

—  f  'JE  +  cos  6  cos  OdO  =  -  cos  ciyj E  -  cos  a  (3.17) 

2  a  J 

n-a 
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This  is  a  similar  result  to  the  optimal  stride  angle  that  we  saw  in  equation  (3.8),  but  here 
we  will  show  the  details.  Leibniz’  rule  for  differentiating  through  the  integral  sign  is: 


a  Mfl)  h{a)  df  (6  a) 

—  J  f(0,a)d0  =  f[h(a),a]hXa)-f[g(a),a]gXa)+  J  V,  -—d6  (3.18) 

g(a)  g(a )  da 

Setting  the  derivative  of  the  average  speed  to  zero  requires  us  to  differentiate  under  the 
integral  sign  in  equation  (3.16): 


d_ 

da 


n+a  rz 

[  — yjE  +  cos  0  cos  Odd  =  0  = 
J  2  a 


n-a 


y] E  +  cos(;r  +  a)  cos(;r  +  a)  +  -^-  yj E  +  cos(;r  -  a)  cos(?r  -  a)  + . . . 


(3.19) 


n+a 


1  “ 

... — —  f  4i4e  +  cos  0  cos  6d6 
2 a2  J  . 


n-a 


Substituting  the  trigonometric  identity  cos  (  A  +  B)  =  cos  A  cos  B  -  sin  A  sin  B ,  that  is, 
cos(^-a)  =  cos(;r  +  a)  =  -cosa,  we  find: 


n+a 


—  ("  'JlJ  E  +  cos0  cos&  =  E -cosa  cosa-^-V-E-cosa  cos  a 

2a  ^  2  2 


V2 

2 


n-a 


(3.20) 


=  -\f2yjE  +  cos  (n  +  a)  cos  (n  +  a) 


which  is  the  formula  for  the  fastest  stride  angle  (3.17)  that  we  saw  before.  The 
interesting  thing  to  note  here  is  that  the  left  hand  side  of  the  equation  is  the  average 
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velocity,  as  we  saw  in  equation  (3.8),  and  the  right  hand  side  of  the  equation  is  the 
instantaneous  velocity 

vg=v{7:  +  a)  (3.21) 

What  this  is  telling  us  in  words  is  that  we  should  continue  to  take  a  larger  step  a  (which 
adjusts  the  average  velocity)  until  the  instantaneous  speed  equals  the  average  speed. 

We  can  see  this  graphically  in  Figure  3.7. 


(a) 


Figure  3. 7:  Instantaneous  forward  speed,  v(#) ,  plotted  in  blue  with  cosine  plotted  in 

green  for  energy  levels  (a)  £=1.01  (b)  £=1.25  (c)  £=1.5  {This  figure  is  presented  in 
color;  the  black  and  white  reproduction  may  not  be  an  accurate  representation) 
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1 


E=1  25 


(C) 

Figure  3. 7 -Continued 
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We  plot  the  instantaneous  speed  (3.15)  as  a  function  of# .  To  find  the  average  speed, 
simply  average  the  height  of  the  blue  graph  from  n -a  to  7T+a.  Then  vary  a  to  see 
how  this  affects  the  average. 

Starting  from  a  =  0,  to  build  speed,  we  fall  forwards  (increasing  a),  which  takes 
us  away  from  our  desired  forward  direction — we  are  taxed  with  a  cosine  penalty 
because  of  it — but  our  increased  average  speed  is  worth  the  penalty  at  first.  As  we 

7Z 

approach  a  =  —  however,  the  penalty  approaches  100%  (note  the  cosine  penalty  in 

green)  because  we  are  moving  towards  the  ground  and  not  moving  forward  at  all.  This 
creates  a  tradeoff:  take  a  step  (increase  a )  to  increase  speed,  but  don’t  increase  a  so 
large  that  you  nose  the  velocity  into  the  ground. 

In  Figure  3.7(a)  we  see  the  graph  of  v(#)  with  E  =  1.01;  the  pendulum  is 

barely  above  the  separatrix  which  means  that  it  slows  down  a  whole  lot  near  the  peak  at 
6  =  n  .  Here  we  take  a  relatively  large  step  to  increase  our  average  speed  by  taking  in 
the  higher  peaks  of  the  velocity  curve,  v,  to  the  left  and  right  of  6  =  n  .  At  higher 
values  of  E  as  shown  in  Figure  3.7(b),  we  see  the  local  minimum  of  v  at  6  -  n  become 
less  pronounced  until  finally,  in  Figure  3.7(c)  at  E-  1.5,  the  local  minimum  has  become 
a  global  maximum  and  it  no  longer  makes  sense  to  walk — similar  to  what  we  saw 
before  in  Figure  3.4. 

\p\  always  has  a  minimum  at  0  -  n ,  so  why  does  it  no  longer  make  sense  to 

walk?  At  higher  energies,  the  change  you  get  from  potential  to  kinetic  energy  is  chump 
change  compared  to  the  total  energy,  i.e.,  E  is  made  up  mostly  of  kinetic  energy  to 
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begin  with,  and  it  is  not  affected  much  by  changes  in  6 — note  how  the  contour  lines 
Figure  3.2  become  flatter  and  flatter  with  respect  to  6  as  E  goes  from  1  to  1.5.  This 
means  that  for  E  >  1.5,  any  deviation  from  straight  line  travel  from  point  A  to  point  B 
is  not  worth  the  price.  You  should  not  walk,  you  should  roll. 

Another  way  to  reason  about  optimal  step  width  from  equation  (3.21)  is  by 
looking  at  the  sampled  average: 


vn=~{v\+v2+---vn)  (3.22) 

n 

where  n  is  the  number  of  samples  taken.  How  would  you  know  if  we  should  be  taking 
a  bigger  step?  We  assume  that  the  step  is  centered  around  0  -n ,  and  we  iterate  a 
numerically  so  that  the  stepwidth  gets  incrementally  larger.  Thus  if 


v„+i>v„  (3.23) 

we  know  that  we  want  to  take  that  incrementally  bigger  step.  From  (3.22)  we  see  that 

,  W+l 

I 

1 


n  1 A  1 

=  — 7— Z.v/+— TTv»+i 
n  + 1  n  ,  n  + 1 


”  't 

I 

n  _  1 

vn  +— ~vn+\ 


n  + 1 


n  + 1 


so  if  we  substitute  v„+)  from  above  into  (3.23)  we  get: 


~vn+— 7V«+1  >vn 


n  +  1  n  +  1 


n  -  (n  +  1) 
n  + 1 


vn> 


n+  I 


rn+\ 
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The  condition  for  satisfying  (3.23)  is 

vw+1>v„  (3.24) 

In  other  words,  if  the  next  incremental  velocity  is  greater  than  your  average,  take  it. 

This  is  precisely  the  recipe  that  we  found  in  equation  (3.21)  for  the  best  position 
average  and  in  equation  (3.8)  for  the  best  time  average. 

We  see  this  in  Figure  3.8,  which  depicts  in  blue  the  right  half  of  the  graph  of 
instantaneous  velocity  that  we  saw  previously  in  Figure  3.7.  Since  those  graphs  were 
symmetric,  we  take  the  average  over  only  the  right  half  side.  Look  at  the  plot  of  the 
average  velocity  in  red.  It  increases  until  it  reaches  the  condition  in  equation  (3.24) 
where  the  curves  cross,  and  from  then  on  the  average  velocity  decreases. 


3.6  Energy  Regime  of  Walking  (1<  E  <  1.5) 

We  know  how  to  determine  the  minimum  walking  energy,  the  separatrix  where 
the  pendulum  just  makes  it  to  the  top  of  its  swing  at  E  =  1 .  We  substitute  6  =  n  and 


0  =  Ointo  equation  (3.1)  to  find  this.  We’ve  seen  clues  from  Figure  3.7  that  E  =  1.5  is 
the  maximum  walking  energy.  Is  there  another  way  we  can  find  this  value?  Let’s  start 
from  equation  (3.15)  for  instantaneous  forward  velocity.  If  we  take  the  derivative  of  v 
with  respect  to  6 ,  we  see  that 


dv  -Jl  .  sin  6 

—  = - cos  9  — p== 

d9  2  yJE  +  cos  9 


Simplifying: 
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(3.25) 


dv 

~dG 


V2  .  J  2£  +  3cos<9 
- sin^  ,  ■  — 

2  \dE  +  cos8 


We  are  interested  in  the  behavior  of  —  at  the  critical  point  0  =  n  because  that 

dO 


is  the  point — evidenced  by  Figure  3.7 — where  v  changes  from  a  local  minimum  to  a 
global  maximum  as  E  approaches  1.5.  There  is  always  an  extremum  there,  because 
clearly  sin  n  =  0 .  The  question  is  whether  that  extremum  is  a  maximum  or  a  minimum. 

if  ~  changes  sign  from  +  to  -  at  8  =  n ,  then  we  know  v  is  a  local  maximum  there. 


Note  that  sin  8  changes  from  +  to  -  through  8  -  n ,  so  what  happens  to  the  sign 
dv 

of  —  around  n  depends  on  the  value  of  E  in  the  numerator  and  denominator  of 
d8 

equation  (3.25).  E  >  I  so  the  sign  of  the  denominator  is  always  positive.  We  are  left 
with  one  culprit:  the  numerator  2£  +  3cos;r  ,  which  changes  sign  when  E=  1.5.  This  is 
the  maximum  walking  energy — the  point  where  v  changes  from  having  a  local 
minimum  to  having  a  global  maximum. 

Figure  3.8  shows  the  change  in  the  optimal  stepwidth  for  two  different  energy 
levels.  It  also  shows  that  the  optimum  stepwidth  occurs  where  the  instantaneous  speed 
equals  the  average  speed  (position  average). 
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E=1  01 


(a) 


E=1  25 


theta  (radians) 


(b) 


Figure  3.8:  Optimum  Stepwidth  at  (a)  £'=1.01  (b)  £=  1.25  {This  figure  is  presented  in 
color:  the  black  and  white  reproduction  may  not  be  an  accurate  representation) 
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CHAPTER  4 


POWERED  RIMLESS  WHEEL 

4.1  Powered  Pendulum 

We  develop  a  dynamic  programming  algorithm  to  calculate  the  minimum  cost 
in  time  and  energy  for  a  powered  simple  pendulum  given  by: 

0  +  sin6  =  u  (4.1) 

Powering  the  pendulum  adds  energy  to  the  system;  it  is  not  a  conservative 
system  anymore.  However,  we  can  calculate  the  change  in  energy  by  taking  the  time 
derivative  of  the  energy  equation  for  the  simple  pendulum  (3.1)  and  comparing  the 
result  to  the  powered  system  in  (4.1): 

—  =  —  f—  61  -cos$l  =  0(d  +  sin#)  =  uO  (4.2) 

dt  dt{2  J  v  ’ 

We  see  from  the  above  equation  that  at  lower  speeds,  9,  a  higher  torque,  u,  is 
required  for  a  given  energy  change,  A E .  For  the  dynamic  programming  analysis  which 
follows,  we  treat  the  powered  pendulum  as  a  discrete  system  able  to  “jump”  to  a  new 
energy  level  A E ,  or  coast  along  the  constant  energy  contours  of  the  pendulum  shown  in 
Figure  4. 1 .  Figure  4.2  shows  this  discretization  of  phase  space  into  a  directed  network. 
Starting  at  the  upper  right  of  the  graph,  we  have  two  choices  at  node  1 :  stay  on  the 
energy  level  you  are  currently  at — move  to  node  6,  or  increase  the  energy  level  by  one 
increment — move  to  node  7. 
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4.2  Principle  of  Optimality 

Both  Dynamic  Programming  and  Pontryagin’s  Maximum  Principle  are 
conceptually  based  on  the  principle  of  optimality:  If  a-b-d  is  the  optimal  path  from  a  to 
d,  then  b-d  must  be  the  optimal  path  from  b  to  d. 

The  optimal  path  for  a  multistage  decision  process  is  shown  in  Figure  4.3. 
Suppose  that  the  first  decision  (made  at  a)  results  in  segment  a-b  with  cost  Ja^  and  that 

the  remaining  decisions  yield  segment  b-d  at  a  cost  of  .  The  minimum  cost  Ja/ 
from  a  to  d  is  then: 


J ad  J ab  J bd 


(4.3) 


Proof  by  contradiction  (Kirk,  1970):  Suppose  b-c-d  in  Figure  4.3  is  the  optimal  path 
from  b  to  d;  then 


Jbcd  <  Jbd 

j|C 

J ab  +  Jbcd  <  Jab  +  Jbe  =  J  ad 


(4.4) 


but  (4.4)  can  only  be  satisfied  by  violating  the  condition  that  a-b-d  is  the  optimal  path 
from  a  to  d.  Thus  the  assertion  is  proved. 


4.3  Dynamic  Programming 

Dynamic  programming  is  a  numerical  technique  to  find  the  optimal  path 
through  a  directed  network.  A  directed  network  consists  of  a  set  of  nodes  and  directed 
arcs  between  the  nodes.  A  directed  arc  as  we  see  in  Figure  4.3  is  simply  an  ordered  pair 
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h 


a 


(b) 

Figure  4.3:  The  Principle  of  Optimality  (a)  and  its  Proof  (b) 

(a,b) ,  where  a  and  b  are  nodes.  The  length  or  cost  of  a  directed  arc  is  denoted  as  Jaf,  . 
In  our  optimization  problem  (Denardo,  1975),  we  wish  to  find  the  shortest  path  through 
the  network.  Let  be  the  minimum  travel  time  from  node  b  to  the  goal  d. 

We  interpret  J ^  as  the  travel  time  of  the  path  from  node  a  to  the  goal  d  that  first 

traverses  arc  {a,  b)  and  then  travels  as  efficiently  as  possible  from  node  b  to  the  goal  d 
as  we  see  in  Figure  4.4. 
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Figure  4. 4:  The  Minimum  Path 

According  to  the  principle  of  optimality,  in  order  for  fa  =  Jad*  to  be  the  optimal 

path  from  a-b-d,  the  path  must  be  optimal  as  well.  In  Figure  4.4,  we  use  this 

fact  to  minimize  the  path  from  a-j-d  over  all  j  where  (a,j)  is  an  arc.  Each  of  the  j 
nodes  has  a  corresponding  fj ,  which  is  the  best  path  from  j  forward— shown  in  dashed 

lines.  The  best  path  from  a  to  d  is  the  one  that  minimizes  the  sum  of  1)  the  immediate 
cost  J aj  and  2)  the  long  term  cost  fj — the  cost  of  the  state  you  put  yourself  in — 

assuming,  of  course,  that  we  behave  optimally  as  we  traverse  from  j  to  d.  This  is 
known  as  the  functional  equation: 

fa  ~  mtn  \jqj  +  fj }  (4.5) 
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We  can  use  this  to  work  our  way  backwards  through  the  network  similar  to  how 
you  would  plan  your  morning  based  on  your  known  start  time  at  work:  If  I  have  to  be 
at  work  by  0700  and  it  takes  30  minutes  to  drive  there,  I  must  leave  the  house  by  0630, 
and  it  takes  me  15  minutes  to  eat  breakfast  so.... it’s  called  back  timing,  you  solve  the 
problem  of  what  time  to  get  up  for  work  in  the  morning  by  starting  from  the  end.  Also 
note  that  we  solve  the  problem  of  finding  the  minimum  path  from  a  to  d  by  embedding 
it  into  the  more  general  problem  of  finding  the  minimum  path  from  any  point  j  to  d. 

We  use  the  discretization  of  phase  space  into  E  and  6  that  we  saw  in  Figure  4.2. 
At  each  node  we  have  two  choices — move  up  one  energy  level,  or  stay  on  the  energy 
contour  you  are  at.  Each  choice  has  an  associated  time  cost  At .  If  you  stay  on  the 
same  energy  contour,  your  only  cost  is  the  time  cost  At  to  go  from  node  a  to  b.  Keep  in 
mind  that  the  time  cost  At ,  varies  depending  on  where  you  are  at  on  the  graph  according 
AO 

to  the  equation  At  =  — where  we  use  equation  (3.5)  for  0 .  If  you  move  up  an  energy 
0 

level,  you  incur  a  cost  equal  to  the  time  cost  plus  the  energy  cost:  At  +  A E 

.  At  if  y=l 

1,1 )  = 

’  At  +  AE  if  j  =  2 
This  corresponds  to  an  arc  length,  where  there  are  now  only  two  choices  for  j — staying 
on  the  same  energy  level  means  that  j  =  1  and  j  =  2  means  we  are  bumping  up  one 
level.  To  be  compatible  with  the  numbering  scheme  shown  in  Figure  4.2  we  must  write 
b  =  a  +  j  +  4 ,  so  that  J becomes: 

Jaj  +fb  =  Cost(a,  j)  +  /  (a  +  j  +  4)  (4.6) 
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The  cost  matrix  Cost  (a,  j)  has  two  columns  and  as  many  rows  as  there  are  nodes. 

With  this  notation,  we  can  now  develop  an  algorithm  for  moving  backwards  through  the 
network  to  solve  the  functional  equation  (4.5): 
for  a  =  d:-l:l 

for j  =  1:2 

Total_Cost(a,j)=Cost(a,j)  +f(a+j +4); 

end 

f(a)  =min(T otal_Cost(a, :)); 

end 

For  example  let’s  examine  Figure  4.2  using  the  above  code  for  the  case  where 
a  =  1 .  We  are  at  node  1  and  we  are  trying  to  see  which  direction  arrow  we  should  take 
next  (go  to  node  6?  or  node  7?).  The  code  will  find  the  minimum  of 
{Co5'/(l,l)  +  /(6)]and  \Cost{\,2)  +  /(7)|  and  set  that  equal  to  /(l).  This  code  is 

thus  another  statement  of  equation  (4.5).  To  know  what  /(6)and  /(7)  are,  we  have  to 

work  backwards  through  the  code,  starting  at  the  end.  The  nodes  are  numbered 
sequentially  to  allow  for  the  above  code  to  work. 

In  Figure  4.5,  we  see  the  result  of  this  minimum  path  tree  for  any  point  in  phase 
space.  Basically,  the  graph  is  telling  us  what  should  be  intuitively  obvious:  at  lower 
energy  levels,  it  will  take  so  long  to  get  to  the  other  side,  you  might  as  well  spend  the 
energy  and  gain  the  time  savings  of  being  on  a  higher  energy  level  (hence:  higher 
speed)  as  quickly  as  possible.  What  might  not  be  as  obvious  is  that  this  logic  has  limits: 


33 


Figure  4.5:  Dynamic  Programming  for  the  Active  Rimless  Wheel 


there  comes  a  point  at  which  it  is  no  longer  worth  it  to  expend  the  energy.  The  graph 
clearly  shows  the  emergence  of  a  switching  curve. 

4.4  Pontryagin’s  Maximum  Principle 

Pontryagin’s  Maximum  Principle  (PMP)  is  the  second  method  we  will  use  to 
study  the  active  rimless  wheel.  PMP  is  a  continuous  time  analog  of  dynamic 
programming,  and  it  is  a  statement  of  mathematical  truth.  We  introduce  PMP  with  an 
economic  example  of  maximizing  profits. 
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Consider  the  decision  problem  of  a  firm  (Dorfman,  1969)  that  wishes  to 
maximize  its  total  profits  J  over  some  period  of  time.  The  firm  produces  profits  at  a 
certain  rate  (j)  depending  on  the  amount  of  (income  generating)  capital  stock  x(t)  at  the 
specified  date  and  decisions  u(t)  at  the  specified  date — rate  of  output,  price  of  output, 
etc. . .  The  total  profits  that  will  be  earned  over  a  period  of  time  are  given  by  summing 
up  this  cash  flow  function  over  time  to  yield  the  value  functional  (Weber,  2005)  . 

T 

j[x(0),«]  =  ^<f>{x,u)dt  (4.7) 

0 

J  is  a  functional  because  it  depends  upon  the  entire  time  path  w  of  the  decision  variable, 

not  just  the  ordinary  variable  u.  If  we  conjecture  that  an  optimal  path  u*  of  the 
solution  exists,  we  could  follow  that  path  to  obtain  the  optimal  value  function. 

J*[x(0)]  =  max  J[x(0),m]  (4.8) 

U 

Take  a  closer  look  at  this  expression.  This  maximum  depends  only  on  the  starting 
position  of  the  state  variable  x(0) .  It  does  not  depend  upon  the  control  variable  u{t) 

because  we  assume  that  we  are  behaving  optimally  from  x(0)  on.  We  are  doing  the 

best  we  can  with  the  hand  we  were  dealt,  and  thus  u  has  been  “maximized  out,”  and  J* 

is  now  an  ordinary  function  to  which  we  can  apply  standard  calculus.  Since  J* ,  the 
optimal  value  function  of  the  firm,  depends  only  on  the  initial  starting  point,  we  would 
like  to  measure  the  effect  of  changing  that  starting  point  x(0),  i.e.  we  seek  its 
dependence  on  state.  The  shadow  price  of  capital  or  co-state  is  defined  as: 
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(4.9) 


a/* 

z  = - 

dx 

The  co-state  measures  the  effect  of  an  extra  capital  unit  on  the  value  of  the 
firm1.  At  any  time  t,  the  value  of  the  firm  equals  the  shadow  price  of  capital  multiplied 
by  the  firm’s  capital  stock  z(t)x{t ) ,  At  each  point  in  time,  the  value  of  the  firm  is 
changing  by 

^-[z(t)x(/)]  =  zx  +  zx  (4.10) 

The  firm’s  value  changes  because  the  amount  of  physical  capital  changes,  zx  , 
and  the  value  of  this  capital  changes,  zx  .  Let  <f>  (x,u)  be  the  net  revenue  at  time  t, 

expressed  in  present  value  terms.  The  decision  maker2  chooses  the  time  path  of  output 
that  maximizes  the  sum  of  the  momentary  net  revenue  and  the  change  in  the  value  of 
the  firm  at  every  instant: 

<j>{x,u)  + zx  +  zx  (4.11) 

The  time  path  of  the  capital  stock  must  also  fulfill  the  state  equation. 

x  =  f(x,u)  (4.12) 

Substituting  this  into  (4.1 1)  yields: 


1  For  example,  the  shadow  price  of  a  fish  is  $10  if  adding  an  extra  fish  to  the  fish  stock  increases  the 
value  of  a  fishery  by  $10.  Similarly,  extracting  a  unit  of  ore  reduces  the  value  of  a  mine  by  the  ore’s 
shadow  price.  An  extra  capital  unit  adds  value  to  an  enterprise  because  it  contributes  to  current  and 
future  revenues. 

2  The  first  extreme  might  be  the  short-term  manager,  who  squeezes  out  every  ounce  of  production, 
sacrificing  the  long-term  viability  of  the  firm.  The  other  extreme  might  be  the  money  manager  of  a  trust 
fund,  who  in  an  effort  to  maximize  the  endowment,  refuses  to  give  out  income  to  the  intended 
beneficiaries. 
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(4.13) 


<f>  ( X,  u)  +  zf  (x, «)  +  zx 

Partial  differentiation  of  the  equation  (4.13),  and  setting  equal  to  zero  for  maximization, 
produces  two  conditions  that  the  optimal  time  paths  of  the  decision  (control)  variable  u* 
and  state  variable  x*  must  fulfill: 


— +  z— =  0 
du  du 


d<f>  df  .  . 

dx  dx 


(4.14) 


(4.15) 


These  equations  are  recognized  as  Pontryagin’s  Maximum  Principle.  The  first 
equation  is  the  maximum  principle  and  the  second  equation  is  the  co-state  equation. 

Another  way  to  motivate  PMP  following  (Dorfman,  1969)  is  to  proceed  exactly 
like  we  did  with  Dynamic  Programming  in  Figure  4.4;  we  maximize  the  immediate  cost 
of  a  decision  over  At ,  and  the  cost  of  proceeding  optimally  from  x(t  +  At)  on 


max  |^- A t  +  J*  [x(f  +  At)]} 

Since  J*  [x(w)J ,  we  use  the  chain  rule  to  take  the  derivative  of  (4.16): 


(4.16) 


d<f>  dJ  d  r  ,  x-i 

jr'At+-^T7r  W'+A')] 

du  ox  du1- 


(4.17) 


x(t  +  At)-x(t) 

We  use  the  Euler  approximation  of  the  state  equation  x  «  — - - - —  and  the  state 


At 


equation  (4.12)  to  solve  for  x(t  + At) «  x  +  /(x,w)At 
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—  -Af  +  z— ("x  +  /(x,m)-A^1 
du  du1  y  ’  J 


(4.18) 


where  we  have  substituted  the  definition  of  z,  the  co-state  equation(4.9).  Setting  the 
derivative  equal  to  zero,  we  can  cancel  out  the  A t  to  yield  PMP: 

du  du 

To  remember  these  formulas,  write  the  Hamiltonian: 

H  -  <p(x,u)+zf(x,u)  (4.19) 

where 


dH 

dz 


(state) 


dH 

dx 

dH 

du 


-z 

0 


(co-state) 

(PMP) 


(4.20) 


For  the  case  where  </>(x,u )  is  a  cost  we  are  trying  to  minimize,  and  since 
min^  =  max  {-</>) ,  we  merely  rewrite  the  Hamiltonian  as: 

H  =  -</>[x,u)  +  zf{x,u)  (4.21) 

4.4  Rocket  Car 

To  begin,  we  take  a  textbook  optimal  control  example,  the  rocket  car,  with  slight 
modifications  to  motivate  our  study:  Figure  4.6  shows  the  outline  of  the  procedure  that 
we  will  follow  for  the  rest  of  the  chapter — this  deliberately  simplifies  the  problem  by 
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leaving  out  potential  energy  changes  in  the  system.  We  want  to  reach  the  origin  with 
maximum  velocity,  instead  of  zero  velocity,  to  simulate  the  transport  of  mass  as  quickly 
as  possible  through  a  single  stride. 

In  the  textbook  treatment  of  the  rocket  car  problem,  the  cost  function,  <j> ,  is 
assumed  to  be  for  a  time  optimal  problem,  a  fuel  optimal  problem,  an  energy  optimal, 
or  a  mixed  case  of  these. 


PMP  predicts  one  switching  curve,  s 


Compute/  i 
before  s 


>i  J  =  AE  + At 

l 


- - , 

Compute  / 

/  =  At 

after  s 

1 

1 

Find  the  switching  curve,  s,  which  minimizes/ 


Figure  4. 6:  The  PMP  Research  Outline 


■J  ■*> 

The  standard  treatment  for  the  energy  cost  function  is  <f>  =  u  .  In  this  section, 
we  derive  a  treatment  for  energy  where  u  is  a  control  force  instead.  We  show  that  the 
cost  function  for  energy  is  <j>  -  uv .  For  the  non-continuous  cases  we  will  be  studying, 
PMP  takes  on  a  slightly  different  form  than  (4.20),  we  now  look  for  supremum  of  H, 
rather  than  its  derivative. 

Given: 


Figure  4. 7:  The  positioning  problem 


Required: 

Drive  the  mass  back  to  the  origin  at  x  =  0  with  maximum  velocity,  while  minimizing 
some  integral-type  cost,  J,  where: 

3  If  u  is  a  current  source,  then  P  =  iV  =  i(/7?)  =  i^R 
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(4.22) 


T 

J  =  J  <f>{x.,u)dt 
0 

In  this  problem,  we  choose  two  components  for  the  cost  (time  and  energy) 

J  =  \dt  +  \dE 

Substituting  for  power 


J  =  $dt  +  \^-dt 


r  dt 

'\ 

2^| 

— 

— 

mv 

J  dt  \ 

,2 

J 

dt 


Substituting  for  kinetic  energy  (we  have  no  change  in  potential  energy  here) 

rdE 
— a 
J  dt 

Noting  that  mv-ma-u : 

J  mvvdt  =  J  uvdt 

Finally,  substitute  the  above  result  into  (4.22)  to  yield  the  cost  integral: 


T 

J  =  ^(k  +  uv)dt  (4.23) 

0 

where  A:  is  a  positive  constant  to  measure  the  relative  importance  between  time  and 
energy  efficiency. 

1st  step  in  the  Pontryagin  Maximum  Principle:  Treat  cost  as  an  additional  state. 
The  state  equations  are  thus  given  by: 
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(tr  > 
*0 

' k  +  uv ^ 

''k  +  ux 2^ 

X  = 

*1 

= 

V 

= 

x2 

x  U  2 

V  u  > 

Since  we  want  to  approach  the  origin  with  maximum  velocity,  u  is  bounded  by 
zero(Kirk,  1970) 


(4.24) 


-1  <u  <  0 

In  other  words,  while  you  may  want  to  back  off  the  gas,  you  would  never  want  to  apply 
the  brakes. 

2nd  step  in  the  Pontryagin  Maximum  Principle :  We  introduce  the  co-state 
variables  to  form  the  Hamiltonian: 


H  =  Z0X0  +  ZjX]  +  z2*2 

(4.25) 

Substituting  (4.24)  we  see  that: 

H  =  Zq  (&  +  UX2  )  +  Z\X2  +  z2u 

(4.26) 

The  co-state  equations  are  prescribed  by  Hamilton’s  equations: 

m  n 

z0  =  ^  =0 

OXq 

(4.27) 

m 

zi  = - =  0 

dx\ 

(4.28) 

dH 

z2=  ,  =  z\  Z0U 
dx2 

(4.29) 
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3rd  step  in  the  Poyntryagin  Maximum  Principle.  Solve  the  above  co-state 
equations.  The  equation  for  z0  shows  that  z0  =  const ,  and  the  PMP  requires  that  this 

constant  should  be  negative.  Without  loss  of  generality  we  can  choose  z0  =  -1 .  The 
two  solutions  to  (4.27)  and  (4.28)  are  thus: 

zq  —  — 1  zx=A  (4.30) 

where  A  is  some  constant  of  integration.  The  solution  to  equation  (4.29)  is  given  by 
substituting  (4.30),  which  yields 

i2  =  u-  A  (4.31) 

and  the  solution  becomes 

z2  =  J  udt  -At  +  B  (4.32) 

Noting  again  that  u  =  mv,  we  know  that  J  udt  =  mv-  mv0  ,  so  that 

z2  =  mx2  +  B-  mvQ  -  At  (4.33) 

where  B  is  another  constant  of  integration. 

4th  step  in  the  Poyntryagin  Maximum  Principle:  Find  the  supremum  of  Has  a 
function  of  u.  The  Hamiltonian  is  given  by  substituting  (4.30)  and  (4.33)  into  (4.26): 

H  =  -k-  wx2  +  Ax2  +  u {mx2  +B- mvQ  -  At )  (4.34) 

To  maximize  Has  a  function  of  u,  we  must  maximize  the  term  uq  where 

q  =  (w-l)x2  +  B-mvQ  -  At  (4.35) 
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If  m=\  then  q'  =  B'- At  where  B'  =  B- mvQ and  since  q'  is  a  linear  function  of  t  there 
is  at  most  one  zero  crossing  for  q ' .  Thus 


u  = 


-1  if  q'<  0 

0  if  q'>  0 


(4.36) 


PMP  has  thus  shown  that  we  have  “on-off’  control  with  a  single  switch.  This 
makes  intuitive  sense,  since  we  would  want  to  apply  the  force  early  on  in  order  to  gain 
the  time  savings  accrued  throughout  the  stride.  However,  we  would  want  to  shut  the 
force  off  after  some  time  period  in  order  to  save  energy. 


We  desire  to  find  the  optimal  control  u* .  From  the  get  go,  we  assume  that  the 
optimal  solution  is  “on-off’  as  described  in  the  previous  section.  First,  we  proceed  with 
a  bang,  i.e.,  from  the  initial  position  to  the  switching  position  s  we  apply  full  reverse 
force  (u  =  -1) ,  and  then  at  s  we  cut  the  engines  off  ( u  =  0)  and  coast  through  the  origin 

with  a  cruising  speed  of  vc  as  shown  in  the  phase  plot  in  Figure  4.8.  Here’s  the  key:  we 

can  find  the  switching  position  s  by  re-writing  the  cost  integral  (4.23)  as  a  function  of  x 
instead  of  t: 

T  x2  ^  x2 

J  =  ^{k  +  icv)dt  =  J  k - 1-  |  udx  (4.37) 

0  xl  V  x\ 

where  we  have  used  the  the  definition  of  work  is  =  J  F  •  =J  udx . 
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V 


For  the  “bang”  section  of  travel,  we  incur  both  time  and  energy  cost.  Working 
backwards  from  the  initial  position  X  at  zero  velocity  until  the  switch  at  position  s  and 
velocity  vc ,  the  time  cost  is: 


km 


\  —  =  km\  —  =  -kmv 
{  u  [  -1 


dv 


(4.38) 


which  uses  the  physics  result  ads  =  vdv  recast  as  udx  =  mvdv .  Recall  that  the  velocity 
is  negative  (so  the  above  cost  is  positive).  The  energy  cost  for  this  period  of  travel  is: 


5  vc  { 

j  udx  =  m  J  vdv  =  —mvc 


(4.39) 


For  the  “off5  section  of  travel,  there  is  only  time  cost.  We  cruise  at  velocity  vc 


from  position  s  to  the  origin,  so  the  time  cost  is: 
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(4.40) 


ks 


By  adding  together  the  above  three  equations  we  get  the  optimal  cost: 


=  -kmvc 


+  —mv, 
2  ‘ 


(4.41) 


We  desire  a  result  that  minimizes  J*  (vc) : 


dJ 

dv„ 


=  0  =  -km  +  mvc  -  k 


s 

V. 


(4.42) 


yC  J 


where  we  must  use  implicit  differentiation,  since  s  and  vc  are  related.  We  find  that 
relation  by  integrating  udx  =  mvdv : 


(4.43) 


we  find  the  relation: 


s  =  X — mvr2 
2  c 

and  the  derivative  of  5  with  respect  to  vc : 


(4.44) 


s '  =  -mvc 


which  we  substitute  into  (4.42)  : 


(4.45) 


0  =  -km  +  mvc  -  k 


-mvc  s 


2 

vc  J 


(4.46) 
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to  find  the  switching  curve 


s 


m 

— —  v„ 


(4.47) 


This  equation  is  plotted  in  Figure  4.9,  where  we  compare  it  to  dynamic  programming 
results: 


4.6  Powered  Pendulum 

As  stated  in  the  beginning  of  Chapter  3,  between  foot  collisions,  the  simple 
pendulum  provides  a  succinct  model  of  the  behavior  of  the  rimless  wheel,  and  as  we 
saw  in  Section  4.3  on  dynamic  programming,  there  is  evidence  that  powering  the 
rimless  wheel,  i.e.,  the  powered  pendulum,  has  a  switching  curve.  Proceeding  along  the 
lines  of  Section  4.4  on  the  rocket  car,  we  will  now  examine  the  switching  curve  of  the 
powered  pendulum  shown  in  Figure  4.10.  Required :  Drive  the  pendulum  from  n-a  to 
n  +  a  with  maximum  speed,  while  minimizing  the  energy  expenditure.  In  other  words, 

T 

we  would  like  to  minimize  the  integral  cost  J  =  ^<f>(x,u)dt  where  x  is  the  state  vector 

o 

and  (j)  is  the  cost  function.  As  before,  J  is  broken  up  into  two  parts:  time  cost  and 
energy  cost: 

J=\dt+\—dt 
J  J  dt 
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Velocity  (unitless) 


Energy-Optimal  Rocket  Car  Switching  Curve  (Dynamic  Programming) 


Figure  4. 9:  Switching  Curve  for  the  Rocket  Car  Problem 


s 


Figure  4.10:  Optimal  “on-off’  control  for  the  powered  pendulum 
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Substituting  (4.2)  into  the  above,  with  a  positive  constant  A:  as  a  measure  of  the  relative 
importance  between  time  and  energy  cost : 


T 

J  =  \(k  +  u6)dt  (4.48) 

0 

1st  step  in  the  Poyntryagin  Maximum  Principle:  Treat  the  cost  J  as  an  additional 
state  (Hocking,  1991) 


x0 

fj' 

X  = 

*1 

= 

9 

The  state  equations  are  thus  given  by: 


(x  ) 

(t) 

'  k  +  u9  " 

'  k  +  ux2  ' 

X, 

= 

9 

= 

* 

= 

x2 

k 

u  -  sin  9 

Ku  -  sin  x,  y 

(4.49) 


Since  we  want  to  reach  our  goal  with  maximum  velocity,  the  control  is  bounded  by 


zero: 


0<w<l 

In  other  words,  you  may  want  to  back  off  the  gas,  but  you  would  never  want  to  apply 
the  brakes. 

2nd  step  in  the  Poyntryagin  Maximum  Principle:  We  introduce  the  co-state 
variables  to  form  the  Hamiltonian: 


H  =  z0x0  +  zxxx  +  z2x 2 
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Substituting  (4.49)  we  see  that: 


//  =  z0  (&  +  «x2  )  +  zjx2  +  z2  (u  -  s^n  *1 )  (4.50) 

The  co-state  equations  are  prescribed  by  Hamilton’s  equations: 


dH  n 
z0=  .  =0 
dx0 

(4.51) 

dH 

(4.52) 

Z\= - =  z2  COS  Xj 

dx\ 

dH 

(4.53) 

z2=  -  =  zl  Z0U 

dx  2 

3rd  step  in  the  Poyntryagin  Maximum  Principle 

Solve  the  above  co-state  equations.  The  equation  for  z0  shows  that  z0  =  const ,  and  the 
PMP  requires  that  this  constant  should  be  negative.  Without  loss  of  generality  we  can 
choose  z0  =  -1 .  The  solution  to  (4.52)  is  coupled  with(4.53),  and  without  solving  both 

equations,  we  cannot  guarantee  that  the  switching  function  is  unique. 

4th  step  in  the  Poyntryagin  Maximum  Principle:  Find  the  supremum  of  H  as  a 
function  of  u . 

H'>H 

~^k  +  9u*  j  +  Z\9  +  z2  -  sin  9  j  >  -  {k  +  Gu )  +  z^G  +  z2  («  -  sin  9 ) 

w*(z2-^w(z2-^) 
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To  maximize  H,  we  note  the  sign  of  term  in  parenthesis.  If  it  is  positive,  we 
multiply  it  by  the  maximum  positive  control  torque.  If  it  is  negative  we  multiply  it  by 
the  minimum  control  torque: 


u  = 


1  if  z2  -  0  >  0 


|0  if  z2-9<0 
PMP  has  thus  shown  that  we  have  “on-off’  control  switch.  This  again  makes 
intuitive  sense,  since  we  would  want  to  apply  the  control  torque  early  on  in  order  to 
gain  the  time  savings  accrued  throughout  the  stride,  but  we  would  want  to  shut  the 
torque  off  after  some  time  period  in  order  to  save  energy. 


(4.54) 


We  desire  to  find  the  optimal  control  u* .  From  the  get  go,  we  assume  that  the 
optimal  solution  is  “on-off”  as  described  in  the  previous  section.  First,  we  proceed  with 
a  bang,  i.e.,  from  the  initial  position  n-aio  the  switching  position  s  we  apply  full 
torque  (u  =  l) ,  and  then  at  s  we  cut  the  engines  off  ( u  =  0)  and  coast  through  the  n  +  a 
as  shown  in  Figure  4.11. 

Here’s  the  key:  we  can  find  the  switching  position  s  by  re-writing  the  cost 
integral  (4.48)  as  a  function  of#  instead  of  t : 


T  S  JiO  TT+Cl  j/3 

J(s)  =  ^[k  +  u0^dt  =  k  |  -jr-+  |  ud6  +  k  J  — (4.55) 

0  7r-a  n—a  s 

The  first  term  is  the  time  cost  before  the  switch,  the  second  term  is  the  energy 

cost  before  the  switch  using  the  definition  of  work  of  a  couple.  E  =  J  ud6 .  The  third 

term  is  the  time  cost  after  the  switch,  and  there  is  no  energy  cost  after  the  switch. 
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£ 


Figure  4.11:  The  Switching  Curve  for  the  Powered  Pendulum 

To  determine  9,  we  must  find  an  equation  for  the  energy,  E(9),  as  a  function  of  9.  At 
the  initial  position#  -n -a,  our  initial  energy  is  Eq ,  and  we  add  energy  according  to 
the  definition  of  work  above  until  we  reach  s: 

9 

E(0)  =  Eq+  J  ud6  =  EQ+0-7r+a  for  n-a<Q<s  (4.56) 

n-a 

We  substitute  this  result  into  equation  (3.5),  9  =  yfl^E  +  cos0  : 

9  =  y[2yjE0  +9  -  n  +  a  +  cos  6  (4.57) 

Plugging  this  into  (4.55),  we  find  that 
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r/ „\  k  r  de 

J  CO  / —  I  r~  '  ' 

v2  _  JEo+0-7r  +  a  +  cos0 


+  s-7r  +  a  +  . 


n-a 

7t+a 


(4.58) 


k  r  aa 

V2  J  JK  +s-n+a+  cos0 


To  find  the  minimum  of  J(s ) ,  we  take  the  derivative  with  respect  to  s,  and  set 


it  equal  to  zero:  —  =  0  = ...  to  find  this  derivative,  we  must  use  Leibniz’  Rule  for 
ds 


differentiating  through  the  integral  sign: 
d  b 

—  j  g(0,s)d0  =  g[b(s),s]b\ 


i  b(s)  b(s)  *  (n  \ 

—  J  =  J  '  .  ~d0  (4.59) 


Using  Leibniz  Rule,  —  =  0  becomes: 
ds 


...  +  —j=  j  (Eq  + s-7r  +  a  + cos 0)  2/2  d0  =  O 


Thus,  we  find  that  the  switching  curve  must  satisfy  the  integral  equation: 


— j  (£0  +s-7T  +  a  +  cos#)  212  d0  =  1 


(4.60) 


To  graph  this  on  a  phase  plot,  as  shown  in  Figure  4.12,  we  find  s  that  satisfies  the  above 
relation,  and  then  we  plug  it  into  (4.56)  then  (4.57)  to  find  the#  that  corresponds  to  the 
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switching  curve,  s  If  we  just  used  Eq  to  find  0,  we  would  not  take  into  account 

the  gain  in  energy  that  occurs  during  the  powered  phase  before  the  switch. 

In  this  chapter,  we  have  found  a  switching  curve  for  the  powered  rimless  wheel, 
enabling  the  control  of  a  rimless  wheel  walker  to  efficiently  and  quickly  move  from 
n  -  a  to  n  +  a  during  the  stance  phase.  Figure  4.12  assumes  a  given  step  width,  but 
this  control  is  optimal  no  matter  where  you  are  in  phase  space.  If  you  happen  to  move  a 
little  off  the  optimum  path,  you  merely  look  up  your  position  on  the  map  and  either 
power  or  coast  depending  on  which  side  of  the  swithing  curve  you  are  on.  In  the  next 
chapter,  we  will  extend  these  concepts  to  a  multiple  link  walking  model  where  analytic 
methods  do  not  suffice,  and  numerical  methods  must  be  used. 
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Angular  velocity  (unitless) 


Energy-Optimal  Pendulum  Switching  Curve  (DP  vs  theoretical) 


Figure  4.12:  Switching  Curve,  Equation  (4.60)  is  shown  as  a  solid  line,  compared  with 
dynamic  programming  results  (shown  in  circles)  {This  figure  is  presented  in  color:  the 
black  and  white  reproduction  may  not  be  an  accurate  representation ) 


55 


CHAPTER  5 


TWO-LINK  MODEL  OF  WALKING 

5.1  Double  Pendulum 

To  complete  our  study  of  walking  behavior,  we  will  examine  the  extension  of 
the  principles  outlined  in  Chapters  1  through  4  to  more  complicated  models  of  walking. 
While  we  have  used  simple  models  to  grasp  the  underlying  principles  of  walking, 
clearly,  in  the  design  of  a  practical  walking  machine,  we  will  need  a  more  complete 
model  of  the  physical  hardware  to  validate  our  ideas.  We  seek  to  verify  the  extension 
of  these  principles  into  more  complete  models  of  the  dynamics. 

In  this  chapter,  we  study  a  two-link  model  of  walking — the  double  pendulum. 
This  model  is  called  the  compass  gait  in  the  literature  (Garcia,  Chatterjee,  Ruina,  & 
Coleman,  1998).  Again,  we  describe  the  swing  phase  of  the  gait  before  foot  collision. 
Looking  at  Figure  5.1,  we  can  derive  the  equations  of  motion  using  the  Denavit- 
Hartenberg  (DH)  convention  (J.  Denavit,  1955).  The  new  x-axis  must  be  perpendicular 
to  (DH1)  and  intersect  (DH2),  the  old  z-axis.  For  planar  systems,  DH2  suffices.  This 
convention  defines  the  homogeneous  transform  (translation  plus  rotation)  from 
coordinate  frame  (x0,y0)  to  as  follows  following  (Mark  W.  Spong,  1989): 

C1  ~sl  a\c\ 

A  o  =  cj  ajs,  (5.1) 

0  0  1 
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The  superscript  1  on  Aq  refers  to  coordinate  frame  1  and  the  subscript  0  refers  to  frame 
0.  We  are  using  the  shorthand  notation  q  =  cos  (<9j )  and  q  =  sin  (^  ) .  The  link  lengths 


are  q  and  a2  respectively.  This  notation  has  the  form:  Aq  = 


Ro  do 


where  R2x2is 


the  rotation  matrix  and  we  note  that  Rq  =  Rz  q  is  a  rotation  about  the  z-axis;  dg  = 


*0 


is  the  position  vector;  and  0jx2  is  the  zero  vector — the  last  row  of  the  matrix  Aq — it  is 
for  computational  compatibility  only.  In  our  case,  we  are  limited  to  the  plane,  so  A  is  a 
3x3  matrix  instead  of  a  4x4  matrix.  The  homogeneous  transform  from  (x1,iy] )  to 

(x2,y2)  looks  similar  to  equation  (5.1): 


0 


~s2  a2c2 
c2  a2s2 
0  1 


(5.2) 


We  must  matrix  multiply  the  transforms  together  to  get  the  composite  transform  from 
(*0>To)to  (*2>T2): 


»  2  A1  i2 

Aq  -  A0Aj  - 


c12 

■^12 

0 


~512 

c12 

0 


a2c12  +alcl 
a2su+alsl 
1 


Here  we  are  using  a  shorthand  notation  for  two  trigonometric  identities: 


q  2  =  cos  +  e2  )  =  qc2  -  s{s2 
S12  =  sin  (0\+02)  =  s\c2  +  c\s2 


(5.3) 


(5.4) 
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//////// 


Figure  5.1:  The  DH  Double  Pendulum 


To  find  the  equations  of  motion  for  this  system,  we  will  use  Lagrange’s  equation 
(Goldstein,  1950): 


dfdL ' 
dt  v  36}  y 


—  =  «;  for  /  =  1,2 
3Gi  1 


(5.5) 


The  Lagrangian  is  defined  as  ,L  =  T-V  where  T  is  the  kinetic  energy  and  V  is  the 
potential  energy  and  u  is  the  torque  about  each  respective  joint.  The  kinetic  energy  is 
given  by 


+ 


(5.6) 
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where  vz-  is  the  velocity  of  mass  /.  Finding  the  velocities  requires  a  bit  of  work.  The 


velocity  of  frame  1  with  respect  to  frame  0  is  found  by  taking  the  time  derivative  of 
as  follows: 


4  = 


KalslJ 


1 

0 


Jl. 


'-a\sA " 

a\C\6\  j 


(5.7) 


9  9  9  9 

Speed  is  the  vector  length  of  velocity  given  by:  v  =  x  +  >>  .  To  find  Vj  in  equation 
(5.6),  we  sum  the  squares  of  the  terms  in  (5.7),  seeing  that: 


v\2=a\2(sl2+<:l2)0l2=a\20\2 


(5.8) 


This  agrees  with  the  result  v  =  ra  from  Physics.  We  go  through  a  similar  procedure  to 
find  the  velocity  of  frame  2  with  respect  to  frame  0:  We  note  from  (5.3)  that 


»2  ' 
“0  - 


f  y2^ 
x0 


yyoj 


f  a2c\2  +  a\c\  ^ 


which  we  can  see  more  clearly  from  Figure  5.2.  As  in 


a2s\2  +a\S\j 

(5.7),  we  find  the  vector  relation  for  velocity  of  the  second  mass: 


2  \2 
0  -do  ~ 


V 

f 

J2) 

V 

~a2s\2^\2 


a2c\2^\2  +<*\C\0\ 
where  0\2  -  @\  +  #2 »  an<3  squaring  the  terms  above: 


(5-9) 


X2  -a2s\2®\2  +  ^a\a2s\s\2^A\2  +  a\S\9\ 

y2  =«22c122^122  +^ala2c\c\2^Al2  +a\C\0\ 


Sum  the  squares: 
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a2sl2 


Figure  5.2:  Geometric  Relations 


v22  ~  x2  +>’22  ~  a\®\  +al'^\2'  +  2axa2c2GxGX2 


(5.11) 


where  we  have  used  the  trigonometric  identity: 


*1*12  +  clc12  =  *1  (slc2  +  C\S2  )  +  q  (qc2  -  5^2  ) 

= sxc2 + s)F^2 + q2c2  -  sy^2 
=  c2 


rp  1  2^2  ,1 

T  =  -mxax  Gx  +-m2 


The  potential  energy  is: 


V  =  mxgy\  +  m2gyl 


=  ™Xg  (qq  )  +  ™2g{axsx  +  a2sx2  ) 


We  now  find  the  Lagrangian: 


(5.12) 


Substituting  (5.1 1)  and  (5.8)  into  (5.6),  the  kinetic  energy  is: 

a\6\  +a22&122  +  2ala2c2(&l2  +  ^1^2  )  (5-13) 


(5.14) 


L  =  ^ml2al2#l2  "*■  T m2a2®\2  +  m2a\a2c2  +  ^2)- m\2Sa\s\  ~™2Sa2s\2  (5-15) 
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where  mX2  =  mx  +  m2 .  Taking  its  derivative  with  respect  to  the  generalized  coordinate 


=  —m\2ga\C\  -m2ga2su 


(5.16) 


Take  the  derivative  of  L  with  respect  to  (\  : 


^  J  ^  ^  •  a  •  /  9  m  \ 

~^-  =  m\2a\  0\+m2a2  0X2+m2ala2c2(20l+02) 


(5.17) 


Now  take  the  time  derivative  of  (5.17): 


1  f  QT 

—  +W2«2242 +WJ2ala2c2  +^2)“ (2^1 +^2)^2  (5.18) 


Combining  equations  (5.18)  and  (5.16)  we  find  the  first  equation  of  motion  (5.5): 

\mnaf  +  m2a2  +  lm2axa2c2  ]  9X  +  \jn2a2  +  m2axa2c2  02 
+  [-m2axa2s2  fafy.  +  02  )  j  +  [mx2gaxcx  +  m2ga2cX2\  =  ux 


(5.19) 


We  can  put  the  above  into  standard  form: 


Mx  X0X  +  MX292  +Vx+Gx  =ux 


(5.20) 


where  the  terms  in  brackets  in  (5.19)  are  the  respective  terms  above. 

Now  we  must  take  the  derivative  of  L  with  respect  to  the  generalized  coordinate 


■  =  -m2axa2s2 


"*2Sa2c12 


(5.21) 


Take  the  derivative  of  L  with  respect  to  02  then  the  time  derivative  of  that  result: 
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dL  2a  a 

— r  =  m2a2  &\2  +  m2a\a2c2@\ 
CU') 


(  dL  ^ 


dt 


(5.22) 


\de2) 


2 -A 


=  m2Ci2~@\2  +  m2a\a2c2@\  ~ m2a\a2s2®\®2 


Combining  equations  (5.21)  and  (5.22)  we  find  the  second  equation  of  motion  (5.5): 


m2a2 


\^rn2.a2  +  ^2a]a2c2^\  + 

+  ^/W2«i«2a'2^12  ]  +  [m2Sa2cl2\  =  u2 


(5.23) 


Again,  we  put  the  above  into  the  standard  form: 

Mj\0\  +  ^22^2  +  F2  + G?  =  11 7  (5-24) 

where  the  terms  in  brackets  in  (5.23)  are  the  respective  terms  above.  Finally  we  can  put 
the  equations  in  matrix  form 

M.^\(  a  \  (v.\  f  G,\  f  u,  \ 

(5.25) 


fM\\  Mx  2n 

+ 

(V\) 

+ 

(  u  \ 

U\ 

yMi\  ^22; 

Jh) 

,G2, 

dl2y 

5.2  Trajectory  Optimization  using  PMP 
As  we  see  in  Figure  5.3,  we  re-cast  our  equations  in  state  form,  where: 


~J  “ 

’*o~ 

0\ 

Xj 

02 

= 

x2 

0\ 

x3 

p2_ 

_x4_ 

(5.26) 
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///// 

Figure  5.3:  Trajectory  Optimization  Problem 

Required:  Drive  the  second  mass  back  to  the  position  x,  =  x2i/ ,  while  minimizing  the 
total  cost,  J,  where: 

T 

J  =  h(Xf)+  J  (p{x,n  )dt 
0 

Substituting  values  for  the  final  cost,  h.  and  the  integral  cost,  (j) : 

2  7 

J  =  M [*2f  - x2df  )  +\(k  +  u]xl+u2x2)dt  (5.27) 

0 

In  this  problem,  we  chose  the  cost  J  to  be  for  the  time  and  energy-optimal  case,  with 
the  relative  importance  between  the  two  given  by  k.  The  penalty  for  error  in  the  final 
state  is  given  by  h  where  M  is  a  constant  multiplier  to  weight  the  importance  of 

71 

reaching  the  desired  final  state, x26//  ■  We  assume  that  we  will  reach  x,  -  —  -a  no 
matter  what,  so  that  the  state  xj  is  not  included  in  the  final  cost,  h. 
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Ist  step  in  the  Poyntryagin  Maximum  Principle:  How  do  we  deal  with  this  cost?  Clearly 
it  adds  a  new  dimension  to  the  problem.  We’ll  treat  cost  as  an  additional  state ,  i.e.,  x  = 
(cost,  position,  velocity )  is  our  new  state  vector  as  shown  in  Figure  5.4. 

The  state  equations  are  now  taking  shape:  our  first  state  equation  flows  directly 
from  the  cost  integral 


x0=(/>  =  k  +  ulx]  +  u2x2  (5.28) 

Since  the  derivative  of  position  is  velocity,  our  second  and  third  state  equations  are 
easy: 


*1  =  *3 
*2=  *4 


(5.29) 


B 


Figure  5.4:  State  Trajectory  from  initial  to  final  state 
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The  fourth  and  fifth  state  equations  come  from  the  double  pendulum  equations 
of  motion.  These  equations  as  well  as  the  co-state  equations  below  are  quite  long; 
reproducing  them  here  is  rather  uninstructive,  so  the  co-state  equations  will  be  relegated 
to  the  appendix.  Here  are  the  equations  of  motion: 


2 

a2 m2 (~§)m 2  cos(0]2 )  ~  a\ g{mn) cos (0, )  + 

0^2^2(02  +  20\  #2)sin(6,2)  +  »|]  + 

2 

(tf2(-^2)-ala2w2  cos($2 ))i~a2§m2  COS(0|2)  + 

a\  02 m2 0\  ~  ( -  sin(02 )) +  u2  ] 

^  2  2  2  2  7  7  7  7 

af  02  m 2  cos  (62 ))  +  a\  aj m^  +  ctf  u5 m\ m2 


0i  = 


2  2 

[2t/2^| m2  C°s(^2 )  +  (m\2^  +  a2m2]x 

-02gm2  cos(0i2 )  +  a\a2m20\  (-sin(<92 ))  +  u 2 

2 

[tf2  (- m2 )  ~  d\ ci2m2  cos(^2 )] x 
a2  ( ~g)m2  cos{0\  2 )  -  ci\g(m\ 2 )  cos(0, )  +' 

a\a2m2{02  +20|  02)sin(02)  +  «i 


+ 
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r/|  a2  w2  (_  cos  ($2 ))  +  al  a2  w2  +  a\a7m | m7 


The  state  equations  are  thus: 


*0 

<f> 

*1 

x3 

i2 

= 

x4 

*3 

xl 

_*4_ 

_x2  _ 

(5.30) 


(5.31) 


(5.32) 


~<iid 


step  in  the  Poyntryagin  Maximum  Principle:  Introduce  the  co-state  variables 


=  [- 1  Zj  Z2  z3  z4  ]  and  form  the  Hamiltonian.  H  =  -<j>  +  z7  x  ,  so  we  are 


maximizing  the  negative  of  a  positive  number,  i.e.,  min  a  =  max  (-a)  if  a  >  0 
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H  =  ZqXq  +  ZjXj  +  Z2X2  +  Z3X3  +  Z4X4 


(5.33) 


Substituting  (5.33)  and  (5.28),  we  see  that 


H  —  —k  —  U\X2  —  u2x 4  +  Z|X3  +  z2x 4  +  ZjXj  +  z^x2 


(5.34) 


From  this  Hamiltonian  (Hocking,  1991)  the  state  and  co-state  equations  are  prescribed 


by  Hamilton’s  equations: 


dH  n 

Zq=-~—  =  0  =>  Zq=-\ 

dXQ 


dH 

Z3  =~~~k 

UX'X 


The  other  set  of  Hamilton’s  equations  are  the  state  equations: 

dH  dH  dH 

x0=  —  =  <f>  x,=  — =  x3  x2=  — =  x4  ... 

UZq  OZj  CZ2 

3"J  step  in  the  Poyntryagin  Maximum  Principle:  Solve  the  co-state  equations  above. 

We  cannot  do  this  symbolically,  so  we  will  solve  the  state  and  co-state  equations 
numerically  after  we  incorporate  the  switching  conditions  on  w,  and  u2 . 

4th  step  in  the  Poyntryagin  Maximum  Principle:  This  is  the  crucial  step  in  applying  the 


PMP.  We  choose  the  control  functions  «,  and  u2  to  give  the  greatest  possible  value  for 
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the  Hamiltonian  //,  considered  as  a  function  of  w,  andw,  respectively.  To  perform  this 


we  take  a  partial  difference,  like  a  partial  derivative  or  gradient  ,V| 


AH  _H*-H 

AW|  Mj*  —  W] 


This  is  like  the  MIT  rule,  a.k.a.  the  delta  rule:  for  maximizing,  if  the  slope  is  positive, 
keep  going  in  that  direction;  if  it  is  negative,  back  up.  Looking  at  only  zq  for  now:  If 

H*  >  H  then  H*-H>  Oand  we  find  a  switching  conditions,  such  that 

W]  -  W|  .V|  >  0  so  that  if  Sj  >  0  then  zq  — »  max ,  and  if  .v,  <  0  the  control  u \  ->  min  . 

The  control  for  u\  is  bounded  by  +1  and  0,  so  the  supremum  of  H  is  attained  when  zq 
takes  the  value  +1  when  ,v,  is  positive  and  the  value  0  when  .y,  is  negative. 


+1  for  s,  >  0 
0  for  s ,  <0 


(5.35) 


Thus  we  assume  “on-off"  control,  with  the  control  switching  between  its  extreme 
values  when  ,y,  passes  through  zero.  The  same  procedure  applies  for  the  second 
switching  condition,  s2 .  Since  these  conditions  are  based  on  the  values  of  the  state  and 

co-state,  we  will  numerically  solve  the  state  and  co-state  equations,  monitoring  the 
value  of  the  switching  condition(s). 


0  for  s2  >  0 

-1  for  .y,  <0 


(5.36) 
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5.3  Boundary  Conditions  for  PMP 


The  final  value  of  the  state  and  co-state  are  specified  by  the  following  formula 
for  the  boundary  conditions  (Kirk,  1970) 


dh 

dx 


t=T 


-z(T) 


d\j-  + 


H(T)+™ 
v  ;  dt 


t=T . 


ST  =  0 


(5.37) 


In  our  problem,  the  final  time  is  free,  so  ST  is  arbitrary,  and  therefore  its  coefficient  in 
brackets  must  go  to  zero: 


dt 


=  0 


H  (T)  + 


dt 


t=T 

( 

M 


(*/- 


x2desired 


2  "\ 


=  0 


t=T 


H\T)+2m(x2(T)~  xMed)x2’  (7>  0 


(5.38) 


The  final  state  in  our  problem  is  mixed  between  the  fixed  state:  (6) )  — >  (x| )  and  free 
states  [62 , 0\ ,  d2  )  ->  (x2 ,  ,  x4 ) :  For  the  fixed  final  state  =  0  and  the  final  value  of 

the  co-state  z\  is  arbitrary.  For  the  free  final  states,  x2  ,  x3  and  x4 ,  the  variation  <5x,  is 
arbitrary,  and  therefore  the  coefficient  of  Sxt  in  (5.37)  must  go  to  zero: 


dh 
dx  2 

dh 
dx 3 

dh 

dx4 


-r3(r)=o 

2  m[x2*{T) 

t=T 

-*3(0=0 

=>  ^3+(^)  =  0 

t=T 

-*4(0=0 

t=T 

V(r)  =  o 

x2desired 


z2 


(T)  =  0 


(5.39) 
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h  is  only  dependent  on  x2  ■  This  tells  us  that  the  final  values  for  the  co-states  z3  and  z4 

must  go  to  zero  at  the  final  time. 

Taken  together,  the  boundary  conditions  are 

H\T)+ 2M  (x2‘  (r)-  x2Jea„j)x2  (T)  =  0 

2m(x2‘ (T)-Xldeslred)-Z2  (T)  =  °  (5  40) 

*3*(r)=o 

h(t)= 0 

The  trouble  with  these  boundary  conditions  is  that  they  tell  us  what  the  final 
values  should  be,  not  what  the  initial  values  should  be.  Therefore,  to  perform  numeric 
integration  forwards,  we  must  guess  at  the  initial  values  of  the  co-states  (fortunately,  we 
already  know  the  initial  values  of  the  states).  After  numeric  integration,  we  compare 
the  final  values  to  those  given  in  (5.40).  This  error  is  the  metric  that  we  will  minimize 
using  the  Nelder  Mead  algorithm. 

5.4  Nelder  Mead  Algorithm 

Nelder  and  Mead’s  simplex  algorithm  (Nelder  &  Mead,  1965)  is  a  popular, 
time-proven  technique  for  optimizing  general  (nonsmooth)  multivariable  functions. 
Optimization  algorithms  of  this  type  are  called  direct  methods  as  opposed  to  gradient 
methods  that  rely  on  derivatives  and,  hence,  smoothness.  They  are  robust  and  general 
and  applicable  to  the  problem  of  gait  determination  by  parameterizing  the  gait  and 
defining  a  metric  over  these  parameters  for  optimization. 
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For  function  /(• )  defined  on  an  n-dimensional  domain  space,  Nelder  and 

Mead’s  method  finds  a  minimum  using  a  simplex  formed  with  n  +1  points.  The  process 
of  finding  a  minimum  proceeds  through  a  process  of  transforming  the  simplex  by  1) 
shifting  points  in  the  direction  of  a  minimum,  2)  shrinking,  and  3)  expanding.  In 
addition,  a  restarting  criterion  can  be  used.  This  restarting  criterion  uses  the  concept  of 
a  simplex  gradient.  The  Nelder-Mead  algorithm  continually  updates  a  simplex  in  the  N- 
dimensional  space  of  the  problem.  For  N=  3,  this  simplex  is  a  tetrahedron  as  illustrated 
in  Figure  5.4.  There  are  N+l  vertices  and  7V+1  corresponding  function  values. 


Pi 


Figure  5.5:  Nelder-Mead  uses  function  evaluation  over  the  vertices  of  an  N- 
dimensional  simplex  to  optimize  a  criterion  function  over  an  W-dimensional 
space.  Shown  here  is  a  tetrahedron,  which  is  a  simplex  for  N-3.  There  are  N+ 1 
vertices  in  an  yV-dimensional  simplex. 
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The  Nelder  Mead  Algorithm  consists  of  the  following  steps: 

1)  Order  the  points  from  lowest  cost  to  greatest  cost. 

2)  Calculate  the  values  of  each  simplex  point. 

3)  Try  a  new  point  to  replace  the  worst  point.  The  algorithm  attempts  each  step 
below  in  order  until  the  cost  of  the  proposed  point  is  less  than  that  of  the  worst 
point: 

•  Reflection  -  reflects  worst  point  about  the  center  of  the  simplex; 

•  Expansion  -  expands  in  the  direction  of  the  best  point; 

•  Contraction  -  shifts  worst  point  towards  the  best  point; 

•  Reduction  -  if  no  other  options  are  available,  each  simplex  point  except  the 
best  is  shrunk  towards  the  center; 

4)  Accept  the  point  once  a  step  successfully  reduces  the  cost. 

5)  Repeat  process  until  a  pre-specified  tolerance  is  met. 

Using  Nelder  Mead,  in  Figure  5.5  (a),  we  minimize  the  functional,  J  for  the 
powered  rimless  wheel  of  Chapter  4.  Since  w(t)  can  take  on  any  shape,  we  describe  it 

using  a  cubic  spline.  Splines  give  a  continuous  and  continuously  differentiable  function 
that  is  well  behaved  and  practical  to  implement.  These  spline  values  are  equally 
distributed  over  time  (20-40  points).  We  feed  the  spline  points  into  Nelder  Mead  and 
optimize  them  to  minimize  the  cost  functional,  J.  As  we  saw  in  previous  results  using 
Pontryagin’s  Maximum  Principle,  the  correct  answer  to  this  problem  at  k  =  5  is  a 
switching  curve  at  t  =  0.6s  with  a  cost  of  5.34.  Nelder-Mead  confirms  this  result. 
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This  result  was  found  after  adjusting  the  initial  simplex  parameters  and  adding  a 
sufficient  number  of  components.  One  point  to  take  away:  when  using  Nelder  Mead, 
you  must  put  an  upper  limit  on  the  input  torque:  merely  penalizing  it  with  the  square 
error  of  the  torque  amplitude  will  not  suffice.  In  Figure  5.5  (b)  we  used  Nelder-Mead  to 
optimize  the  coefficients  of  a  Fourier  series  (half-range  expansion  of  the  torque 
function)  which  found  a  similar  result  for  the  switching  curve. 

Finally,  in  Figure  5.6,  we  used  Nelder  Mead  on  spline  points  where  we  clipped 
the  output  between  0  <«(/)<  1 .  The  code  for  this  example  is  shown  in  the  appendix. 

5.5  Nelder  Mead  Optimization  and  the  Double  Pendulum 
Taking  what  we  learned  about  optimizing  the  powered  pendulum,  we  now 
tackle  the  powered  double  pendulum  with  Nelder  Mead.  We  must  stress  here  the 
importance  of  giving  Nelder  Mead  several  different  starting  values.  Otherwise  the 
algorithm  can  get  caught  in  one  of  several  local  minima.  In  this  section,  we  show  an 
example  of  Nelder  Mead  optimization  of  the  initial  conditions  on  the  co-state  variables 
with  a  metric  as  outlined  in  equation  (5.40).  As  an  example,  we  set  the  masses  equal  to 
/«j  =  1  and  m2  =  0.01 .  The  link  lengths  and  gravity  were  both  set  to  one  and  k  was  set 

7T  7T  71  71 

to  5.  The  initial  value  for  the  joint  positions  was  0\  =  — - and  02  = - .  The 

2  6  2  6 

initial  conditions  on  the  joint  velocities  were  =  1  and  #2  =  -1 .  This  is  meant  to  be  just 

one  example  of  a  selection  that  could  be  used.  With  the  high  weight  M  =  100  of  metric 
selection  we  used  in  equation  (5.40),  it  effectively  sought  to  drive  the  point-placement 
error  to  near  zero.  This  metric  selection  was  used  with  randomized  Nelder-Mead 
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(a) 


Figure  5.5:  Nelder-Mead  Optimization  using  (a)  Fourier  Series  half  range  expansion 

and  (b)  Spline  Components 
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(b) 


Figure  5.6:  Nelder  Mead  for  the  powered  pendulum  showing  (a)  the  raw  spline  curve 
(b)  control  torque  after  applying  a  hard  clip  0  <  u  <  1 . 


74 


optimization  to  find  an  optimum  stride  by  calculating  20  results  and  choosing  the 
results  with  the  smallest  metric  values.  The  results  fell  into  a  few  local  minima.  The 
first  had  no  control  torque  whatsoever,  which  resulted  in  a  large  metric  (over  80K)  and 
thus  it  was  not  considered  further.  The  second  local  minimum  actually  had  the  best 
metric  (1 .407),  meaning  that  it  was  the  best  fit  to  equation  (5.40);  however,  it  had  a 
worse  overall  cost  J=  1.857.  The  third  minimum  is  what  is  presented  here  in  Figure 
5.7,  with  a  metric  of  2.804,  but  a  cost  of  only  J  =0.064. 

This  chapter  has  shown  that  we  can  extend  the  results  of  Chapters  3  and  4  to 
solve  more  complicated  walking  problems.  Using  the  simple  rimless  wheel  model 
helped  us  gain  insight  into  problem,  now  as  we  move  towards  solving  practical 
problems,  we  need  more  realistic  models  of  walking.  Increasing  the  degrees  of  freedom 
as  shown  here  makes  the  problem  more  realistic,  but  we  must  be  aware  of  the  potholes 
along  the  way.  As  the  dimension  of  the  problem  increases,  local  minimum  can  trap  the 
Nelder-Mead  optimization  algorithm. 
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Figure  5.8:  Optimal  control  torque  w2  shown  in  red  and  the  switching  conditions^  and 
$2  shown  in  yellow  and  green  for  (a)  the  entire  time  period  and  (b)  a  close-up  of  the 
initial  time.  (This  figure  is  presented  in  color;  the  black  and  white  reproduction  may 

not  be  an  accurate  representation) 
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CHAPTER  6 


CONCLUSION 


In  this  dissertation,  we  showed  a  control  scheme  for  walking  that  decided  1) 
whether  it  made  sense  to  walk  or  not — an  energy  regime  for  walking  2)  the  optimal 
stride  angle  to  take  when  it  made  sense  to  walk  and  3)  an  optimal  switching  curve: 
when  we  should  power,  and  when  we  should  coast.  We  used  simple  models  of  walking 
behavior:  the  passive  rimless  wheel  and  the  powered  rimless  wheel,  and  we  showed 
how  this  work  could  be  extended  to  higher  dimensional  models  of  walking. 

Several  results  from  this  work,  most  notably  the  optimal  control  using 
Pontryagin’s  Maximum  Principle,  required  numeric  methods  to  implement.  In  this 
dissertation,  we  showed  how  to  use  Nelder  Mead  optimization  to  reduce  the  number  of 
variables  for  optimization  down  to  the  initial  conditions  of  the  co-state  equations.  This 
approach  is  highly  parallelizable  because  you  need  different  random  starting  conditions 
for  Nelder-Mead. 

For  future  work,  we  would  like  to  compare  these  results  to  various  classic 
descriptions  of  inverted  pendulum  walking  (Alexander,  2003;  Kuo,  2002;  Kuo  et  al. 
2005;  Ruina  et  al.  2005)).  These  models  incorporate  nondimensional  parameters  that 
would  be  well  suited  to  comparison  here. 
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Another  avenue  for  future  work  is  to  incorporate  collision  costs  in  these  results. 
(Srinivasan  2006)  shows  that,  for  a  simple  model,  inverted  pendulum  walking  is 
preferred  over  all  other  possible  gaits  at  low  speed  even  when  collision  is  included. 

A  key  result  of  this  thesis  was  the  optimal  switching  curve  in  phase  space  for  the 
powered  rimless  wheel.  For  this  simple  model,  phase  space  was  a  plane,  and  the 
switching  curve  was  a  one-dimensional  curve  separating  phase  space  into  distinct 
regions  (power  and  coast).  What  would  this  “switching  curve"’  look  like  in  higher 
dimensional  systems?  Even  if  the  full  “switching  curve’"  could  not  be  described 
analytically,  could  you  describe  it  for  a  local  region  of  phase  space?  This  result  would 
be  useful  to  an  off-road  robot,  where  the  initial  conditions  for  the  gait  are  constantly 
changing. 
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APPENDIX  A 

DYNAMIC  PROGRAMMING  CODE 
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%This  program  figures  out  the  energy  cost  for  a  powered  pendulum 
%Where  the  goal  is  to  reach  a  specific  ANGLE 

%Nodes  are  in  terms  of  (Energy, theta)  which  is  then  used  to  find  thetad 
%A11  nodes  are  below  the  separatrix  (cw  rotations— whirling) 

%we  first  build  an  energy  contour  map 

delta_theta=-0.01; 

delta_E=0.125; 

n=4;%n+l  is  the  number  of  energy  levels 
%the  walker  is  moving  ccw  (thetad  is  negative) 

%We  plot  the  different  energy  contours  of  the  simple  pendulum 

[theta, E]  =  meshgrid(pi+0.5:delta_theta:pi-0.5,  1 .01  :delta_E:  1 .01+n*delta_E); 

%thetad  as  a  function  of  (Energy,  theta) 
thetad=-sqrt(2)*((E+cos(theta)).A0.5);  %Strogatzp.  170 

%and  now  we  plot  the  energy  contour  map  on  theta, thetad  state-space 
%mesh(theta, thetad, E); 
hold  on; 

v=[1.01  1.125  1.25  1.375  1.5];  %other  Energies 

[C  ,h] =contour(theta,thetad,E,  v,  'k-' ) ; 

clabel(C,h,'manual');  %in-line  manual  labeling 

xlabel('theta'); 

ylabel('thetad'); 

axis  equal 


%Now  we  need  an  energy  and  time  cost  matrix 

%find  the  time  cost  per  iteration 
delta_t=delta_theta  ./thetad; 

%Goal  is  Energy  level  5 
%Start  at  Energy  level  1 


%Each  angle  has  5  energy  levels  (5  nodes) 

%At  any  given  time  we  have  two  choices: 

%We  can  stay  on  the  same  energy  contour  (denoted  by  2  in  the  tree) 
%or  move  up  to  the  next  energy  contour  (denoted  by  3  in  the  tree) 
%The  final  option  is  where  there  is  no  path  (denoted  by  1  in  the  tree) 
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%First  column  is  the  time  cost  alone  (no  E  input) 

%Second  column  is  the  cost  of  energy  input  (0.25)  +  time  cost 

for  i  =  1 :5*length(delta_t)  %5  x  column  length 
Cost(i,l)  =  delta_t(i); 

Cost(i,2)  =  delta_E  +  delta_t(i); 
end 

%Every  5th  node  is  at  the  max  energy  level 
%NaN  indicates  a  non-existant  path 
%NaN  is  ignored  by  the  min()  function 

for  i=l  :length(delta_t) 

Cost(5*i,2)=NaN;  %Max  Energy  level  is  5 
end 


%t(i,j)  from  Cost(m,n)?  i=mJ=m+n+4 
%f(i)=min_j  (t(i,j)  +  f(j)}  %see  Denardo 

f(501)=0;  %The  entire  column  of  theta=pi-0.5  is  the  goal 

f(502)=0;  %hence  we  need  five  zero  cost  nodes 

f(503)=0; 

f(504)=0; 

f(505)=0; 

f(506)=NaN;  %for  computational  compatibility 

f(507)=NaN; 

f(508)=NaN; 

f(509)=NaN; 

f(510)=NaN; 


%Tree  of  Shortest  Paths 

clear  Total  Cost; 

for  m  =  500 :  - 1 : 1  %decrement 

for  n  =  1:2  %there  are  two  choices  per  node 

Total_Cost(m,n)=Cost(m,n)+f(m+n+4);  %total  cost  (NaN  +  f  =  NaN) 
end 

f(m)=min(Total_Cost(m,:));  %need  to  update  f 
end 
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Na=NaN(500,l);  %NaN  is  used  to  denote  a  nonexistant  path 
Total_Cost=cat(2,Na,Total_Cost);  %concatenate  Total  Cost  with  a  row  of  NaNs 
[f2,k]=min(Total_Cosf );  %there  is  a  min  path  choice/node 
k(505)=l; 

%min  path  tree 

for  n=l  :length(delta_t)  %column  length 
for  m=  1:5  %rows  then  columns 

K(m,n)=k(5*(n-l)+m);  %5*(n-l)  is  the  #  of  the  cells  already  filled 
end 
end 

%Now,  from  the  min  path  tree,  I  would  like  to  build  a  vector  map 
%that  shows  the  direction  of  the  preferred  flow  for  the  powered  pendulum 
%We  need  to  calculate  v,u  the  rise  and  run  of  the  vectors  at  each  point 
%the  run,  u,  is  easy  it  is  simply  delta_theta 
%we  must  make  a  matrix  of  delta  thetas 

u=ones(5,101)*delta_theta;  %vector  data  for  the  run 
for  n=l  :length(delta_t) 
for  m=T:5 

if  K(m,n)==3  %moving  up  to  the  next  energy  contour 
v(m,n)=thetad(m+ 1  ,n+ 1  )-thetad(m,n); 
end 

if  K(m,n)==2  %staying  on  the  same  energy  contour 
v(m,n)=thetad(m,n+ 1  )-thetad(m,n); 
end 

if  K(m,n)==T  %no  path 
u(m,n)=0; 
v(m,n)=0; 
end 
end 
end 

%plot  the  vector  map— every  fifth  element  so  not  too  many  arrows 
%quiver(  theta,  thetad,u,v); 

quiver(theta(:,l  :5 : 1 0 1  ),thetad(:,  1 :5 : 1 0 1  ),u(:,  1 :5 : 1 0 1  ),v(:,  1 :5 : 1 0 1 )); 
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APPENDIX  B 

NELDER  MEAD  OPTIMIZATION  OF  THE  POWERED  RIMLESS  WHEEL 
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(*  This  program  uses  Nelder  Mead  to  optimize  the  torque  profile  to  move  a  pendulum 
from  pi-a  to  pi+a  with  free  final  time  *) 

(*  number  of  torque  spline  points  *) 
numPoints  =11; 

(*  torque  values  *) 
torques=.; 

(*  time  increment  of  the  spline  points  *) 
deltaT  =  0.2; 

(*time  values  *) 

times  =  Table[deltaT*(ii-l),{ii,l,  numPoints}] 


{0,0.2, 0.4, 0.6, 0.8, 1  .,1.2, 1 .4, 1 .6, 1.8,2.} 


(*  define  the  input  function  as  spline  values  over  the  data  *) 
dataTable[xTimes_,xTorques_]:=Table[{xTimes[[ii]], 
xT  orques  [  [i  i]] } ,  { ii ,  1  ,Length[xT  orques] }  ] ; 

(*  u  is  the  interpolation  of  the  spline  points  *) 

u[t_,xTimes_,xTorques_]:=Interpolation[dataTable[xTimes, 

xTorques],Method->"Spline”][t]; 

(*  u  is  the  interpolation  of  the  raw  spline  data  that  we  feed  into  Nelder  Mead  *) 

(*  Clip  will  limit  u  between  0<u<l  *) 

(*  define  the  pendulum  equation  with  CLIPPED  u  (t)  between  0  and  1  *) 
thetaDD[t_,xTimes_,xTorques_]:=  -Sin[y[t]]  +  Clip[u[t,xTimes,xTorques],{0,l]]; 

(*  define  the  metric  to  be  minimized,  i.e.  J  =  E+kT  *) 
a  =  Pi/6;  (*  half  the  step  width  *) 
thetaFinal=  Pi+a; 

k=5;  (*  the  relative  importance  of  time,  where  0<k<inf  *) 
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metricW[torques_?(VectorQ[#,NumericQ]&),times_,thetaO_, 

thetadotO_,tf_]:= 

Block[{}, 

Off[NDSolve::"mxst"]; 

(*  msoln  finds  the  solution  for  more  time  than  we  need,  i.e.,  tf_  is  our  guess  at 
the  final  time.  We  then  use  findRoot  to  find  where  theta  (tfinal)  =  thetaFinal. 
We  then  truncate  the  integration  for  the  pathMetric  between  0  and  tfinal  *) 

msoln:=NDSolve[{y"[t]==thetaDD[t,  times,  torques].  y[0]==theta0, 
y’[0]==thetadot0},y,{t,0,tf}]; 

(*  find  the  final  time  where  theta  =  thetaFinal.  start  the  guess  at  t  =  0.5  *) 
finalTime=FindRoot[Evaluate[y[t]/.msoln]-thetaFinal,{t,0.5}] ; 

(*  finalTime  is  not  a  number,  it  is  something  like  (t->2. 33455}  *) 
tfinal=t/.finalTime  ; 

(*  Here  was  the  hard  part  of  the  pro  gram...  finding  the  right  pathMetric  that 
Mathematica  could  handle.  We  wanted  to  use  pathMetric=NIntegrate[u[t]*y'[t]]. 
which  took  a  while  to  figure  out  the  right  format  *) 

(*  After  a  lot  of  guessing,  we  finally  found  something  that  works  *) 

(*  The  following  yielded  a  cost  of  5.32899  with  a  pronounced  switch  at  around 
t  =  0.6,  and  tfinal  =  0.944  *) 

pathMet=NIntegrate[Clip[u[t, times, torques], {0, 1  }]* 
Evaluate[y'[t]/.msoln],{t,0.001,tfinal-0.001  ]]; 

(*  when  using  Evaluate,  we  need  to  define  pathMetric  =  pathMet[[l]],  otherwise 
use  pathMetric.  pathMet  is  not  a  number,  it  is  something  like  {4.32345}  *) 

pathMetric=pathMet[[l  ]]; 

(*  u*thetadot  where  u  is  clipped  b/t  0  and  1  *) 

(*  this  is  the  metric  for  Nelder  Mead  to  evaluate  *) 

(pathMetric+k*  tfinal) 

];  (*  end  of  Block  *) 


(*  perform  Nelder  Mead  optimization  on  spline  points  *) 

xx  =  {x0,xl,x2,x3,x4,x5,x6,x7,x8,x9,xl0}; 
initialPosition  =  Pi-Pi/6; 
initial  Velocity  =  0.8; 

(*  E0  =  1.186  *) 

finalTimeGuess  =  2;  (*  overshoot  the  final  time  to  include  underpowered  guesses  *) 
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sln=NMinimize[ 

{metric W[xx, times, initialPosition,initialVelocity,finalTimeGuess] } ,  xx, 
■  Method->  { "N  elderMead " , "  Random  Seed"  ->  1 }  ] 


{5.32899,{x0->63.5233,xl->18.3641,x2->24.3306,x3->1.79457,x4->-34.5851,x5->- 
50. 0206, x6->-42. 955, x7->30. 234, x8->-2.19082,x9->8. 39748, xl0->-4. 77845}} 

Row[Plot[u[t, times, xx/.sln[[2]]],{t,0,finalTimeGuessj],Plot[Clip[u[t, times, xx/.sln[[2]]], 
{0,1  }],{t,0,tflnal}]] 

tfinal  (*  Clip  will  limit  u  between  -1  and  1  *) 


0.944315 

thetasoln=NDSolve[{y"[t]==thetaDD[t,times,xx/.sln[[2]]]. 

y[0]==initialPosition,y'[0]==initialVelocity},y, 

{t,0,  tfinal}]; 

Plot[Evaluate[y[t]]/.thetasoln,{t,0, tfinal}] 


7.0  Pi/6 
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3.66519 


gg[tJ:=Graphics[{ 

{ 

{Gray,Rectangle[{-.9,-.8},{-l,-.8  +  Clip[u[t,  times,  xx/.sln[[2]]],  {0,1 }]}]}, 
{Thick,Red,Line[{{-.9,-.8},{-l,-.8}}]}, 

Thick, Line[  { {0,0} ,  { Sin[th]  ,-Cos[th] } }  ] } , 

Disk[{Sin[th],-Cos[th] }  ,0. 1  ] 

}  ,PlotRange->  { { - 1 . 1 , 1 . 1 } ,  { - 1 . 1 , 1 . 1 } }  ]/.th-> 

Evaluate  [y  [t]]/.thetasoln  [[  1  ]  ] ;  Animate  [gg  [t] ,  { t,0,tfinal }  ] 


87 


APPENDIX  C 

STATE  AND  CO-STATE  EQUATIONS  FOR  THE  DOUBLE  PENDULUM 
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(*  Mathematica  notebook  file  *) 


M 1 1  =  (/??(  +  /772)c/|  A 2  +  /?72c/2  A 2  +  2w2c/| £72Cc7,y[02 ] 

9  2 

2a-)aym'j  cos(#2)  +  c/f  (/77(  +  7779  )  +  c/2  7772 

Ml 2  =  w2a2  A  2  +  7H2tf|ff2Cc«[02 ] 

2 

<^\a2m2  COS(02)  +  £/2/?72 

M2 1  =  7772c/2  A  2  +  w2tf] c/2C o.v[#2 ] 

9 

c/|£72t?72  cos(02)  +  c/2/772 
M22  =  7772c/2  A  ^ 

? 

c/2  m2 

MM  =  {{Ml  1,M12},{M21,M22}) 

9  2  9 

(/77j  +  7772)c/p  +2cos(6,2)«2w2£/I  +a2m2  m2a2  +  COS(<92  )^|  m2a2 

2  9 

7772fl2  +COS(^2)fl|/772C/2  £/277?2 

MM/  =  Inverse[MM ] 


a\mi 

2  111  999  2  9 

-COS  (#2)c/|  7772£/2  +  £/|  7772£/2  +  £/|  777|7772a2 

9 

-/t?2c/2  -cos(6,2)«|7772a2 

2  2  2  ^  2  9  2  2  2 

-COS  (#2  )c/p  7772  C/2  +  £/|  7772C72  "^^1  ^  |77?2£/2 


-/772c/2  —  cos(t?2  )ci\ft%2a2 

9  999  111  1  9 

-C0S"(<92  )af/7?2C/2  +£/f/772£72  +£7f77?]7772£72 

/I  9 

(777|  +/772)t/p  +2cOS(^2)£72/?72£/|  +  C/2  7772 

9  111  9  2  2  9  9 

-  COS"  (6*2  )c/f  m2  a2  +  £/f  7772  C/2  +  C/p /77|  7772C/2 


fl  = -7772C/|C/2,S'/77[^2](26>|  02  +  02^2) 

c/)C/27772(^22  +  2#i  #2)(-sin(#2)) 

G1  =  (777|  +  7772  )gC/]Cc7i'[0|  ]  +  7772ga2Ctt.v[<?|  +  #2  ] 
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a2Sm2  cos(6*|  +  6*2 )  +  ciy  g(//7|  +  m2 )  cos(<9( ) 

V2  =  m2aya2Sin[02]0\ A  2 
•  2 

a\a2m20\  sin(#2) 

G2  =  m2ga2Cos[0 1  +  02 1 
a2gm2  cos(6|  +  02 ) 

(*  Coriolis  and  Gravitational  Torques  *) 

7T  =  {V1  +  G1,V2+G2) 

{ a2gm2  cos(6>|  +02)  +  ciyg(my  +  m2)cos(0\ )  -  aya2m2{02  +20\02  )sin(6*2 ), 

•  7 

a2gm2  cos(<9|  +  02)  +  aya2m20y~  sin(#2)) 

(*  State  Eqn  M.qdd+V+G=tau 
qdd=MMI.(tau-V-G)  *) 

theta  DD  =  MM  I  .(-7T  +  {u  y,u2)) 

7 

a2  m2  {a 2  {-g)m2  cos(6*|  +  02 )  -  ay  g{my  +  m2  )  cos(6*| ) 

T  9  9  9  9  9  9  9  9  *~ 

r/|  a2m2(-cos  ( 02))  +  ay  a2m2  +afa2mym2 

2  •  9 

(a2(-m2)-aya2m2  cos (6*2))(-t/2g77?2  005(6*1  +  62)  +  c/|£/2/7726'1“(-sin(6*2))  +  777) 

9  9  2  9  9  2  9  9  9 

r/|  a2/772  (-cos“(62 ))  +  a[ a2m2  +  aycijinyin-) 

•  7  ' 

a\a2m2{02  +20\  62)sin(62)  +  W|) 

22?  9  i  i  i  22 

t/|  a2  /772  (-  cos  (02  ))  +  £7|  a2  m2  +  a\  ci 2 my  m2 
2  2 

{2a2ay  m2  cos(02 )  +  a |  (/7?i  +  /772  )  +  a2 m2  )(-ci2gm2  cos(6*|  +  6*9 ) 

777  7  99799 

a[ «9w2 (— cos~(62 )) +  a\  a2m2  +ci\ci2my,n2 

9 

(C72(-W2)-C/|£/27772  008(6*9  ))(C72(-»-)/772  COS(6*|  +  02  )  -  Clyg{niy  +  7772  )  COS(6*|  ) 

222  2  9  2  9  9  9 

(9 1  c/2  7777  (-  cos  (62 ))  +  c/f  a 2  m2  +  u[ a2  my  m2 

‘2  •  ? 
aya2m2(02  +20\  02)s\n{02)  +  Uy)  + aya2m20y  (~sin(62))  +  772) 

22?  2  111  29 

r/|  C727772  (-cos  (69  ))  +  af  c/2  a?72  +  <-hm\m2 
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(*  write  thetaDD  in  state  space  form  in  preparation  for  symbolic 
calculation  of  the  Hamiltonian  *) 


thetaDDS  =  thetaDD /.  {6\  -»  x( .02  — »  x2.0\  — >  x2 , 62  — >  x4 } 


(* 


T 

The  cost  is  given  by  :  ./  =  (x  fimd  -  xdesjrcd  )2  +  +  W|  6\  +  u2  02  at 

0 


thus 


h  =  (x2  -xdf 


and 


~xo~ 

X1 

<h 

x2 

= 

Ol 

x3 

0\ 

_x4 . 

e2 

The  Hamiltonian  is  given  by: 

H  =  zq  xo  +  z\  x\  +  z2  X2  +  z2  X3  +  z4  X4 

M  ■  — (j)  +  Z|  $|  +  Z-)  61  +  z2  0\  +  z4  61 
H  =-k  -  »|X3  -  »2-y4  +  -1-X3  +  -2x4  +  z3  x3  +  r4  x4 
*) 

H  =  -k  -  W|X3  -  «2-y4  +  z)x3  +  Z2X4  +  z2thetuDDS[[\\\  +  z4thetaDDS[[2]\ 
(*  CoState  Equations  *) 

z\dot  =  -D[H,x\] 

zldol  =  -D[H.x  2] 
z3dot  =  -D[H,x2] 
z4dot  =  -£>[//,  x4] 

(*  above  output  suppressed  for  brevity  *) 
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HI  =H/.{u]  -»wl*} 

HI  =H/.{u2  —>ii2*} 

(* 

switching  condition  is  the  PMP.  i.e.  the  supremum  for  H 

if  H*  >  H  then  H*  -  H  >  0 
♦ 

and  (u  -u)a>  0 
thus  if  a  >  0 

then  u  — >  max 
and  if  a  <  0 

then  u  — »  min 
*) 


■  c-  ....  rSiinplify\H\  -//] 
switch]  =  Simpliffl - - f] 

u\  —ll\ 

-a2<7fx3(w2  sin  (x2)  +  ni\  )-a]z4  cos(x2)  +  t/2(z3  ~za) 

2  7  ~r"' 

ci\  a2  ( m2  sin-  (x2 )  +  m\ ) 

.  f„  Simplify[H2  -  H]t 
switch2  =  Simplify^ - . . -] 

u2  -it2 

2  2  *2 

ct\  ((W|  +  m2  )z4  -  a2  m2x4  (m2  sin-  (x2 )  +  m\ ))  -  u2u\ rn2  (z3  -  2z4 )  cos(x2 )  + . . . 

...  +  a2m2(z4  -z3) _ 

a\ a2  in 2  (m2  sin  (x2 )  +  /wj ) 


(* 

the  switching  condition  given  above  defines  where  the  control 
torque  U\  switches.  We  now  define  the  switching  condition  in  terms  of  numerical 
values  in  preparation  for  numerical  integration  of  the  state  and  costate  eqns.  If  you 
change  numeric  values,  don't  forget  to  change: 

1)  switch  condition  s[t_J  here 

2)  stateDDN  and  costateDDN 

3)  graphics 
*) 

HN  :=  H  /  .{x  |  — >xl[r],x2  — »x2  |/],x3  — »xl'[/],x4  — >x2'[/]. 

Z|  ->zl[/],z2  — »z2[/],z3  ->z3[/],z4  — »z4[/], 
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k  — »•  5, 

£/|  — ^  1. 

£J2  ^  l  - 
m\  — >  1, 
m2  — »  0.01, 

g->l, 

i/|  — >  ulslar, 
u2  — >  u2star } 

.vl  =  switch 1  /  .{x|  — »  xl[/],x2  — >  x2[/],x3  — >  xl'[/],x4  — >  x2'[/]. 
z\  — >  zl|7 ],  z2  — >  z2[f],  z3  — >  z3[t ],  z4  — >  z4[Y], 

CY|  — » 1, 

^  1, 
m\  ->1, 
m2  — ^  0.01, 

*->1} 

.v2  =  switch!  / . {X|  — »xl[/],x2  — >x2[/],x3  -»xl'[/],x4  — >  x2 '[/]. 

Z\  — >  z\[t],z2  — >  z2[/],z3  — >  z3[/ ],z4  — >  z4[/], 
a\  -»  1, 

a2  — > 

«7|  — >  1, 

m2  — >  0.01, 

g-*l} 

(*  The  control  torque  ustar[t_]  will  switch  from  "high"  to  "low"  based  on  the 
conditional  statement  s[t]  shown  above;  if  you  would  like  to  use  zero  control  torque,  try 
0.  instead  of  0 

Because  Simplify  used  u~  u*  we  must  switch  like  this 

if  si  >0 
* 

then  u  1  — »  max 

if  .vl  <  0 

* 

then  wl  — » min  *) 

ills  tar  :=  Piecewise[{  {0.,.vl  <=  0.},{l...vl  >0.}}] 
u2star  :=  Piecewise[{{-\ .,s2  <=  0.},{0..  s2  >0.}}] 
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(*  Substitute  numerical  values  in  the  state  eqn  preparation  for  numerical  integration  *) 


stateDDN  =  thetaDD  /  .{0\  — »  xl [t],02  -»x2[7],#i  — >  x\'[t].02  — >x2'[/]. 

<7|  —>  1. 

W|  ->  1, 
m2  — >  0.01, 

ii\  — >  uhtar, 
u 2  — »  u2star 

i 

(*  include  numbers  for  the  different  masses,  lengths,  etc.,  in  preparation  for  numeric 
integration  if  you  change  numeric  values,  don't  forget  to  change  for  the 

1) switch 

2)  stateDDN  and  costateDDN.  and 

3) Graphics!  *) 


costateDD  =  {z\dot,z2dot,  zsdot,  z4dot} : 

costateDDN  =  costateDD  /  .{jcj  — >  xl[/], x2  — >  x2[t],  x3  — >  xl '[/]. x4  — »■  x2 '[/], 
Z\  — >  zl[t],z2  — >  z2[/],z3  — >  z3[/],Z4  — >  z4[/], 
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Cl\  -» 1, 

a2  — >  1, 

m\  ->  1, 

^2  — >  0.01, 

W]  — »  ulstar, 
u  2  — >  uls  tar 
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APPENDIX  D 

NELDER  MEAD  OPTIMIZATION  OF  THE  DOUBLE  PENDULUM 
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(*  Mathematica  notebook  file  *) 

(*  This  code  is  a  Nelder  Mead  optimization  of  the  initial  co-state  values  for  a  double 

pendulum.  It  uses  the  symbolic  results  for  state  and  co-state  equations  as  well  as  the 

Hamiltonian  and  switching  conditions  from  Appendix  C  *) 

x2desired  =  -3Pi/2+Pi/6; 

metricZ[zz  ?(VectorQ[#.NumericQ]&)]  := 

Block[{}, 

OfffNDSolve:  fmxst"] : 

msoln  :=  NDSolve[ 

( 

i 

xT'IXI  ==  stateDDN[[l]J.  x2"[t]  ==stateDDN[[2]J. 

zl’[t]  ==  costateDDN[[l]]. 

z2 ' [t]  ==  costateDDN[[2]], 

z3"[t]  =  costateDDN[|3]J. 

z4'[t]  ==  costateDDN[[4]]. 

(*  initial  conditions — if  you  change  here,  change  in  soln2  *) 

xl  [0]  ==Pi/2-Pi/6.  x2[0]  ==  -Pi/2-Pi/6,  xl*[0]  ==  1.  x2'[0]  ==  - 

(*  Nelder  Mead  must  adjust  these  co-stae  initial  conditions  *) 

zl[0]  —  zz[[l]], 

z2[0]  ==  zz[[2]], 

z3[0]  =  zz[[3]], 

z4[0]  ==  zz[[4]]  }. 


(*  it  is  critical  to  cover  the  entire  range  for  interpolation,  rather  than  extrapolation  *) 


{xl,x2,zl.z2,z3,z4},{t.0,2} 

];  (*  end  of  NDSolve  *) 

(*  find  the  final  time,  i.e.,  the  time  where  thetal  =  Pi  +  a  *) 

T  =  FindRoot[Evaluate[xl  [t]/.msoln]  -  (Pi/2  +  Pi/6),  {t.l }]; 
tfinal  =  t/.T: 

(* 

the  metric  for  Nelder  Mead  to  minimize  is  the  boundary  conditions  that  PMP  must 
satisfy,  final  time  is  free: 

Of 

final  states  x3  and  x4  are  free 
z3(T)  =  0 
z4(T)  =  0 

and  final  state  x2  is  free  (with  penalty): 

z2{T)  =  =  2M(x2f  -  x2 d )  / .  {/  ->  tf } 

OX  7 

the  only  final  state  that  is  fixed  is  xl,  thus  zl(T)  is  arbitrary. 

*) 

z2f  =  (z2[tfinal]  /.  msoln)[[l]]; 
z3f  =  (z3[tfmal]  /.  msoln)[[l]]; 
z4f  =  (z4[tfmal]  /.  msoln)[[l]]; 

HNf  =  (HN  /.  msoln  /.  t  ->  tfinal)[[  1  ]]; 
x2df  =  (x2;[tfmal]  /.  msoln)[[l]]: 
x2f  =  (x2[tfinal]  /.  msoln)[[l]]; 

(*  weighing  the  final  error  a  little  higher  *) 
dhf  =  200  x2df  (x2f  -  x2desired); 

(*  xl  fixed,  we  force  it  to  be.  but  x2  is  free:  we  allow  it  to  vary  (at  a  cost,  h)  *) 
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dh2f  =  200  (x2f  -  x2desired); 

(z3fA2  +  (z2f-  dh2f)A2  +  (HNf  +  dhf)A2  +  z4fA2) 

];  (*  end  of  Block  *) 

(*  Perform  Nelder  Mead  optimization  on  the  guesses  at  the  initial  values  for  the  co¬ 
states  *) 

zz  =  {zzl.  zz2,  zz3,  zz4}: 
sin  =  NMinimize[ 

{metricZ[zz]}.  zz,  Method  ->  {"NelderMead".  “RandomSeed"  ->  8)] 

(*  change  the  random  seed  to  get  different  results  *) 

{2. 80382, {zzl  ->0.78843 1  ,zz2->-0.583 1 46,zz3->-0. 1 04222, zz4->-0.0 1 57834} } 
(*  display  the  results  of  the  Nelder  Mead  optimization  *) 
tfinal 

soln2  =  NDSolve[ 

{ 

xl”[t]  ==  stateDDN[[l]].  x2”[t]  ==  stateDDN[[2]], 

zl’[t]  ==  costateDDN[[l]], 

z2’[t]  ==  costateDDN[[2]], 

z3’[t]  ==  costateDDN[[3]], 

z4?[t]  —  costateDDN[[4]], 

(*  initial  conditions  must  match  those  in  msoln  *) 
x  1  [0]  ==  Pi/2  -  Pi/6,  x2  [0]  =  -Pi/2  -  Pi/6,  x  1’  [0]  ==  1 . 
x2’[0]  =  -1, 
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(*  Nelder  Mead  must  adjust  the  co-state  initial  conditions  *) 


zl  [0]  ==zzl  /.  sln[[2]], 

z2[0]  ==  zz2  /.  sln[[2]], 

z3[0]  =  zz3  /.  sln[[2]], 

z4[0]  =  zz4  /.  sln[[2]], 

\ 

)  5 

{xl,  x2,  zl,  z2,  z3,  z4},  {t,  0,  2} 

] 
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